Understanding Meteor Publish / Subscribe

I've got a simple app set up that shows a list of Projects. I've removed the autopublish package so that I'm not sending everything to the client.

 <template name="projectsIndex">    
   {{#each projects}}      
     {{name}}
   {{/each}}
 </template>

When autopublish was turned on, this would display all the projects:

if Meteor.isClient
  Template.projectsIndex.projects = Projects.find()

With it removed, I have to additionally do:

 if Meteor.isServer
   Meteor.publish "projects", ->
     Projects.find()
 if Meteor.isClient
   Meteor.subscribe "projects"
   Template.projectsIndex.projects = Projects.find()

So, is it accurate to say that the client-side find() method only searches records which have been published from the server-side? It's been tripping me up because I felt like I should only be calling find() once.


Collections, publications and subscriptions are a tricky area of Meteor, that the documentation could discuss in more detail, so as to avoid frequent confusion, which sometimes get amplified by confusing terminology.

Here's Sacha Greif (co-author of DiscoverMeteor) explaining publications and subscriptions in one slide:

subscriptions

To properly understand why you need to call find() more than once, you need to understand how collections, publications and subscriptions work in Meteor:

  1. You define collections in MongoDB. No Meteor involved yet. These collections contain database records (also called "documents" by both Mongo and Meteor, but a "document" is more general than a database record; for instance, an update specification or a query selector are documents too - JavaScript objects containing field: value pairs).

  2. Then you define collections on the Meteor server with

    MyCollection = new Mongo.Collection('collection-name-in-mongo')
    

    These collections contain all the data from the MongoDB collections, and you can run MyCollection.find({...}) on them, which will return a cursor (a set of records, with methods to iterate through them and return them).

  3. This cursor is (most of the time) used to publish (send) a set of records (called a "record set"). You can optionally publish only some fields from those records. It is record sets (not collections) that clients subscribe to. Publishing is done by a publish function, which is called every time a new client subscribes, and which can take parameters to manage which records to return (e.g. a user id, to return only that user's documents).

  4. On the client, you have Minimongo collections that partially mirror some of the records from the server. "Partially" because they may contain only some of the fields, and "some of the records" because you usually want to send to the client only the records it needs, to speed up page load, and only those it needs and has permission to access.

    Minimongo is essentially an in-memory, non-persistent implementation of Mongo in pure JavaScript. It serves as a local cache that stores just the subset of the database that this client is working with. Queries on the client (find) are served directly out of this cache, without talking to the server.

    These Minimongo collections are initially empty. They are filled by

    Meteor.subscribe('record-set-name')
    

    calls. Note that the parameter to subscribe isn't a collection name; it's the name of a record set that the server used in the publish call. The subscribe() call subscribes the client to a record set - a subset of records from the server collection (e.g. most recent 100 blog posts), with all or a subset of the fields in each record (e.g. only title and date). How does Minimongo know into which collection to place the incoming records? The name of the collection will be the collection argument used in the publish handler's added, changed, and removed callbacks, or if those are missing (which is the case most of the time), it will be the name of the MongoDB collection on the server.

Modifying records

This is where Meteor makes things very convenient: when you modify a record (document) in the Minimongo collection on the client, Meteor will instantly update all templates that depend on it, and will also send the changes back to the server, which in turn will store the changes in MongoDB and will send them to the appropriate clients that have subscribed to a record set including that document. This is called latency compensation and is one of the seven core principles of Meteor.

Multiple subscriptions

You can have a bunch of subscriptions that pull in different records, but they'll all end up in the same collection on the client if the came from the same collection on the server, based on their _id. This is not explained clearly, but implied by the Meteor docs:

When you subscribe to a record set, it tells the server to send records to the client. The client stores these records in local Minimongo collections, with the same name as the collection argument used in the publish handler's added, changed, and removed callbacks. Meteor will queue incoming attributes until you declare the Mongo.Collection on the client with the matching collection name.

What's not explained is what happens when you don't explicitly use added, changed and removed, or publish handlers at all - which is most of the time. In this most common case, the collection argument is (unsurprisingly) taken from the name of the MongoDB collection you declared on the server at step 1. But what this means is that you can have different publications and subscriptions with different names, and all the records will end up in the same collection on the client. Down to the level of top level fields, Meteor takes care to perform a set union among documents, such that subscriptions can overlap - publish functions that ship different top level fields to the client work side by side and on the client, the document in the collection will be the union of the two sets of fields.

Example: multiple subscriptions filling the same collection on the client

You have a BlogPosts collection, which you declare the same way on both the server and the client, even though it does different things:

BlogPosts = new Mongo.Collection('posts');

On the client, BlogPosts can get records from:

  1. a subscription to the most recent 10 blog posts

    // server
    Meteor.publish('posts-recent', function publishFunction() {
      return BlogPosts.find({}, {sort: {date: -1}, limit: 10});
    }
    // client
    Meteor.subscribe('posts-recent');
    
  2. a subscription to the current user's posts

    // server
    Meteor.publish('posts-current-user', function publishFunction() {
      return BlogPosts.find({author: this.userId}, {sort: {date: -1}, limit: 10});
      // this.userId is provided by Meteor - http://docs.meteor.com/#publish_userId
    }
    Meteor.publish('posts-by-user', function publishFunction(who) {
      return BlogPosts.find({authorId: who._id}, {sort: {date: -1}, limit: 10});
    }
    
    // client
    Meteor.subscribe('posts-current-user');
    Meteor.subscribe('posts-by-user', someUser);
    
  3. a subscription to the most popular posts

  4. etc.

All these documents come from the posts collection in MongoDB, via the BlogPosts collection on the server, and end up in the BlogPosts collection on the client.

Now we can understand why you need to call find() more than once - the second time being on the client, because documents from all subscriptions will end up in the same collection, and you need to fetch only those you care about. For example, to get the most recent posts on the client, you simply mirror the query from the server:

var recentPosts = BlogPosts.find({}, {sort: {date: -1}, limit: 10});

This will return a cursor to all documents/records that the client has received so far, both the top posts and the user's posts. (thanks Geoffrey).


Yes, the client-side find() only returns documents that are on the client in Minimongo. From docs:

On the client, a Minimongo instance is created. Minimongo is essentially an in-memory, non-persistent implementation of Mongo in pure JavaScript. It serves as a local cache that stores just the subset of the database that this client is working with. Queries on the client (find) are served directly out of this cache, without talking to the server.

As you say, publish() specifies which documents the client will have.