Meteor's subscription and sync are slow

I have a collection of 10M documents covering 6000 stocks; the stock name is indexed. When I subscribe to a new stock, Meteor hangs for more than 10 seconds while fetching about 3000 documents for that stock. After several stocks are subscribed, Meteor also hangs with 100% CPU usage. Meteor seems really slow at syncing a "big" collection. My app is actually read-only. Is there a way to speed up Meteor for a read-only client? Would creating a separate collection for each stock help?


Solution 1:

With the autopublish package installed, Meteor pushes the entire dataset to your client.

You can turn off autopublish by removing the autopublish package:

meteor remove autopublish

Then create a specific subscription for your client.

When you subscribe you can pass a session variable as an argument, so on the client you do something like:

Meteor.autosubscribe(function () {
    Meteor.subscribe('channelname', Session.get('filterval'));
});

On the server you use the argument to filter the result set sent to the client, so that you are not piping everything all at once. You segment the data in some fashion using a filter.

Meteor.publish('channelname', function (filter) {
    return Collection.find({field: filter});
});

Now, whenever you change filterval on the client with Session.set('filterval', 'newvalue'), the subscription is changed automatically and the new dataset is sent to the client.

You can use this as a means of controlling how much and what data is sent to the client.
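For the stock case in the question, the filter can also cap how many documents go out per subscription. The following is only a rough sketch: the helper buildStockQuery, the publication name 'stockHistory', the collection name Stocks, the field names symbol and date, and the 100/500 limits are all assumptions made for illustration, not anything taken from the question.

```javascript
// Hypothetical helper: builds the selector and options for one stock's publication.
// The field names ('symbol', 'date') and the limits (100 default, 500 max) are assumptions.
function buildStockQuery(symbol, limit) {
  if (typeof symbol !== 'string' || symbol.length === 0) {
    return null; // no valid stock requested, so there is nothing to publish
  }
  var MAX_LIMIT = 500;
  var n = Math.floor(Number(limit));
  if (!(n >= 1)) {
    n = 100; // fall back to a default for missing, NaN, zero or negative limits
  }
  return {
    selector: { symbol: symbol },
    options: { sort: { date: -1 }, limit: Math.min(n, MAX_LIMIT) }
  };
}

// Registers the publication only when running inside a Meteor server.
// `Stocks` is an assumed collection name.
if (typeof Meteor !== 'undefined' && Meteor.isServer && typeof Stocks !== 'undefined') {
  Meteor.publish('stockHistory', function (symbol, limit) {
    var q = buildStockQuery(symbol, limit);
    if (!q) {
      return this.ready(); // publish an empty set
    }
    return Stocks.find(q.selector, q.options);
  });
}
```

On the client this would be subscribed to with something like Meteor.subscribe('stockHistory', Session.get('filterval'), 200), so that only the newest documents for the selected stock are synced.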

As another poster said, you really have to ask if this is the best tool for this job. Meteor is meant for relatively small datasets that are updated in real-time in (potentially) two directions. It is heavily optimised and has a ton of scaffolding for that use case.

For another use case (such as the read-only huge dataset) it may not make sense. It has a lot of overhead that provides functionality that you are not going to use, and you'll be coding to get the functionality that you need.

Solution 2:

I was struggling with the same issue. In my case I only had to sync ~3000 records, around 30KB total. After weeks of trying I eventually realized that the sync was not the issue, but seemingly the LiveHTML updates that happened while syncing.

I was able to reduce my page load from 10 seconds for 300 (filtered) records to less than 2 seconds for all 3000 records by disabling template updates during the initial page load. I accomplished that by adding a condition to the function that defined the template content:

Before (10s page load for 300 records being published by the server):

Template.itemlist.items = function () {
    return Item.find({type: 'car'},
                     {sort: {start: -1},
                      limit: 30});
};

To (2s page load for 3000 records published by the server):

Template.itemlist.items = function () {
    if (Session.get("active")) {    
        return Item.find({type: 'car'},
                         {sort: {start: -1},
                          limit: 30});
    } else {
        return [];
    }
};

To "activate" the session only once the data was loaded, I added:

Deps.autorun(function () {
    Meteor.subscribe("Item", 
                     {
                         onReady: function() {
                             Session.set("active", true);
                         }
                     });
});

Solution 3:

While this is a scale issue and can probably be improved, it should be noted that you are using the wrong technology for your task, because Meteor is meant for interaction between clients and not for retrieving tons of read-only, time-sensitive data. While a status tracking screen might still somewhat make sense, time-critical data in huge amounts certainly does not...

The whole Meteor stack introduces an extreme overhead over a simple implementation in any native stack; honestly, I would even take into account the overhead that Java or C# would introduce and think twice when choosing between those and low level languages like PHP and C++. Ruby, Python, Node.js and the like are really a different story: they are made for rapid prototyping, but in terms of latency and throughput they lag behind because of the overhead of JIT-compiling or interpreting them, not to mention the overhead that some non-native approaches add.

TL;DR: Use the right tools for the job, or you'll cut your fingers...

Solution 4:

With autopublish enabled you may see a performance hit with large collections of documents in MongoDB. You can address this by removing autopublish and writing code that publishes only the relevant data instead of the entire database.

The docs also go into managing cache manually:

Sophisticated clients can turn subscriptions on and off to control how much data is kept in the cache and manage network traffic. When a subscription is turned off, all its documents are removed from the cache unless the same document is also provided by another active subscription.
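One way to apply that on the client is to keep only a few stock subscriptions open at a time and stop the oldest one when a new stock is opened. This is a minimal sketch, not an established Meteor pattern: createSubscriptionPool is a hypothetical helper, and it only assumes that the function passed in behaves like Meteor.subscribe and returns a handle with a stop() method.

```javascript
// Hypothetical helper: keeps at most `maxActive` subscriptions open at once.
// `subscribeFn` is assumed to behave like Meteor.subscribe, i.e. it returns
// a handle object that has a stop() method.
function createSubscriptionPool(subscribeFn, maxActive) {
  var cap = maxActive >= 1 ? maxActive : 1; // always allow at least one subscription
  var active = []; // entries of { key, handle }, oldest first

  return {
    // Opens a subscription, or returns the existing handle if it is already open.
    open: function (name, arg) {
      var key = name + ':' + arg;
      for (var i = 0; i < active.length; i++) {
        if (active[i].key === key) {
          return active[i].handle;
        }
      }
      // Stop the oldest subscriptions until there is room for the new one.
      while (active.length >= cap) {
        active.shift().handle.stop();
      }
      var handle = subscribeFn(name, arg);
      active.push({ key: key, handle: handle });
      return handle;
    },
    // Lists the currently open subscriptions, oldest first.
    activeKeys: function () {
      return active.map(function (entry) { return entry.key; });
    }
  };
}
```

In a Meteor client the pool could be created with createSubscriptionPool(function (name, arg) { return Meteor.subscribe(name, arg); }, 3), where the cap of 3 and any publication name passed to open() are again assumptions. Stopping a handle lets the client drop that stock's documents from its cache, unless another active subscription still provides them.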

Additional performance improvements to Meteor are currently being worked on, including a DDP-level proxy to support "very large number of clients". You can see more detail on this at the Meteor roadmap.