Managing Large Data in React Native

Problem: Long Load Times

This is a pretty big topic, but I'd like to give a broad overview of some common things to think about when working with large amounts of data. In React Native, it's pretty common to develop apps that need to load a lot of data before they can render anything useful to the user. This can quickly bring some big challenges to our development plan, and it is something beginning developers can underestimate. Downloading a lot of data up-front can make the initial load time too long for the user. Spreading the load out so that data loads on demand may lower the up-front load time, but it may require more loading spinners throughout your app, making it seem generally slow and sluggish.

Problem: Over-burdening the Device

Load time is just one of many things to think about with large data. You also need to think about how too many network requests may affect the user's mobile data usage and their battery consumption. The more work you put the device through, the more power it requires. Another thing to consider is where you will store this data. Storing too much data in memory may use up too much of the device's RAM, leaving other apps with little memory to work with, or may exhaust the available memory and crash the app. Boy, this is getting a little overwhelming :)

These sorts of problems are very common in applications, and there are some techniques that can handle most large data challenges and keep your apps running smoothly. Before we decide what approach to take, though, we need to understand what we have to work with. Below are some ways I get organized when determining what strategy to apply.

Step 1

Gather Information

It might be tempting to rush off and start solving things right away. I think most of us have this tendency, and I have learned to first stop, take a deep breath, and understand exactly what problem I am having before I try to solve it :)

Understanding the problem will also help you think about solutions, even some you may not have thought of. Here are some things I do when trying to understand data size challenges:

Make note of the specific areas in your app that are slow to load.
Check each area with a network inspection tool, measure load speeds, and confirm that the problem is in fact large data being downloaded and not some other issue with your app.
Think about why those areas of your app are loading so much data, and whether they need to load some of that data at all.
Make note of how often data is being requested, which may bring additional flaws in the general architecture to the surface.

Step 2

Brainstorm your options

Once you have reviewed and understood what your app is currently doing with your data, you should have a clearer picture of your options for improvement. Depending on what I find, I generally like to ask these questions:

What bits of data are essential to the user up-front, right after launching the app?
What bits of data are not essential to the user right after app launch?
Is the entire bulk of this up-front data large in size?
Does the entire bulk of this up-front data change often?
Is the API good at what it does? Does it return data that is easy to manage?

What bits of data are essential to the user up-front, right after launching the app?

In this age of fast internet and ever-shortening patience, the user of your app wants to see your app's content immediately. Users expect not to wait long for an app to launch and load just to see the first screen. This makes it important to think about what data is really necessary to download right from the start, because we don't want the user waiting while the app loads things it doesn't need to display yet. So think about what data is absolutely crucial to load at the beginning.

What bits of data are not essential to the user up-front?

Take note of data that is not essential to the user when the app launches. Is it possible to load this later? Does the API allow you to load it later?

Is the entire bulk of the "up-front" data large in size?

Once you have a clearer idea of what data you absolutely need at the start, this is a great follow-up question to ask. Exactly what you consider to be large really depends on your use case, and since we are talking about mobile networks, which vary greatly in speed, you should know a little about your target users and their typical network speeds. Will they mostly be using Wi-Fi or cellular data? How fast might their Wi-Fi or cellular connection be? As a rough rule of thumb, any download that is over a couple hundred kilobytes, or that shows the user a loading spinner for more than a few seconds, can be considered large. This suggests that loading this bulk of data every single time your app launches is not something your users would be thrilled about, and you might need to think about further optimizing your strategy. Which leads us to our next question...

Does the entire bulk of the "up-front" data change often?

This is an important question, because if the data never changes, then you may only need to request it just once, and never request it again -- or you might be able to request it just once a month, or even just once a week, and it may not be too bad to wait longer than usual for a download since it is an occasional inconvenience.

If that's the case, we can pretty much move on to our next question. I feel it would be a good enough solution for most apps. But if the data does change often, long loads, as I said, may cause users to lose interest or become frustrated with your app. We need to think deeper about this issue. Some tactics come to mind, which I will discuss later. For now it is enough to realize that some additional planning may be needed here, and that we cannot simply load all our data at the start without affecting the user experience. Loading data at the start is still needed, but we need to see what we can do to give the user the illusion that it has loaded fast, if not instantly. Just give it some thought for now, and let's continue.

Is the API good at what it does? Does it return data that is easy to manage?

Sometimes you might find yourself swimming against the current, trying way too hard to make the data you get from the API load efficiently. Maybe it's time to stop and think: is the API itself the problem?

An API that I consider easy to work with, in terms of getting data, will generally have traits similar to these:

It is flexible enough to let you download a full result, or a partial result only returning the most recent changes for a set of data
It allows you to paginate certain data results
It splits data into what you can call an item's metadata, and an item's details
It can be queried/filtered

Hopefully the above has helped you think about what you could investigate, and has helped you start developing a strategy. Having said that, let's think about a more concrete plan of attack for our React Native app.

Solutions

I will get a bit more technical about things now that we are speaking about some concrete solutions. Just remember that these are simply ideas, and not hard truths. There are possibly other ways to go about handling data, and what I share is from my personal experience :)

Optimizing Your Initial Load Time

Load Only Necessary Data at the Start

You should focus on the API endpoints that get you only what you really need at the start of your app. This will reduce the unnecessary loading time for the user as the rest of the data your app needs is not going to be displayed or used in any way at this time. This is a bit of a no-brainer, so I won't say much else about it. Let's keep going...

Load Partial Data

If the essential data is still fairly large, and if your API allows it, another tactic is to load the entire bulk of data once, and load only new data on subsequent requests. To do this, your API must let you make follow-up requests that download only the data that has been added or updated since your last request. There are probably many ways to do this, but it is typically done by sending the API the timestamp of your last successful request. The API can then check for data updated since that timestamp and return only those records. Your app then merges this data with what it already has in memory. This can significantly cut down your load time and reduce the lag for the user.
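To make the timestamp idea a bit more concrete, here is a minimal TypeScript sketch. The `Item` shape, the `updatedAt` field, and the `fetchSince` function are all assumptions made up for illustration; your API will have its own names and may use a different mechanism entirely.

```typescript
// Hypothetical item shape: assumes every record has an id and a last-modified time.
interface Item {
  id: string;
  title: string;
  updatedAt: number; // epoch milliseconds
}

interface SyncState {
  itemsById: Record<string, Item>;
  lastSync: number | null; // null means we have never synced
}

// Hypothetical API wrapper: given null it returns everything,
// given a timestamp it returns only items updated after it.
type FetchSince = (since: number | null) => Promise<Item[]>;

// Merge incoming items over the ones already held, matching by id.
function mergeItems(
  current: Record<string, Item>,
  incoming: Item[],
): Record<string, Item> {
  const merged: Record<string, Item> = { ...current };
  for (const item of incoming) {
    merged[item.id] = item;
  }
  return merged;
}

// The first call downloads the full set; later calls download only changes.
async function sync(
  state: SyncState,
  fetchSince: FetchSince,
  now: () => number = Date.now,
): Promise<SyncState> {
  // Record the time before the request so changes made while the
  // request is in flight are picked up by the next sync.
  const requestedAt = now();
  const incoming = await fetchSince(state.lastSync);
  return {
    itemsById: mergeItems(state.itemsById, incoming),
    lastSync: requestedAt,
  };
}
```

One caveat with this sketch: it uses the device clock for the timestamp. If your API returns its own "as of" time with each response, storing that value instead is probably safer, since device clocks can be wrong.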

Create the Illusion of Speed with a Cache

Even with the two tactics above, you may still have some unwanted load time that holds the app up long enough to be a problem. When you must show something to the user immediately, one solution is to cache the data locally on the initial download. The next time the user launches the app, the cached data can be presented right away with virtually no load time. In the background, you can then make an update call that downloads the partial data that has changed since the last request. Once this data is processed and merged, you can update the view. The great thing about this approach is that it removes the need for a spinner. The user sees content immediately and can begin to navigate the app. If some new content comes in through the partial update, that is generally OK, since the user may see just a few areas blink and re-populate. The main advantage is that the user is not waiting around and is able to see something right away.
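Here is a rough sketch of that cache-first flow in TypeScript. The `KeyValueStore` interface, the `fetchFresh` function, and the `render` callback are assumptions for this example. The store mirrors the `getItem`/`setItem` shape that AsyncStorage-style libraries expose, but any storage with those two methods would do. To keep it short, the sketch re-downloads the full data in the background rather than a partial update.

```typescript
// Minimal key-value storage interface (an assumption for this sketch).
interface KeyValueStore {
  getItem(key: string): Promise<string | null>;
  setItem(key: string, value: string): Promise<void>;
}

// Show cached data right away (if any), then refresh in the background.
// `render` is a hypothetical callback that updates the UI.
async function loadWithCache<T>(
  key: string,
  store: KeyValueStore,
  fetchFresh: () => Promise<T>,
  render: (data: T, source: "cache" | "network") => void,
): Promise<void> {
  const cached = await store.getItem(key);
  if (cached !== null) {
    render(JSON.parse(cached) as T, "cache"); // shown immediately, no spinner
  }
  const fresh = await fetchFresh();
  await store.setItem(key, JSON.stringify(fresh)); // ready for the next launch
  render(fresh, "network"); // the view updates once new data arrives
}
```

On the very first launch there is nothing cached, so `render` is called only once, after the download. On later launches it is called twice: first with the cached copy, then with the fresh one.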

Be Choosy About When to Load Data

Be lazy, wait until it's needed

This is a fairly straightforward idea to understand. A good rule of thumb is simply to not load data until you need it. There are some screens the user will rarely visit, and loading data for those before the user enters them is unnecessary. There are exceptions to this rule, as there probably are with everything else I've mentioned. It's just a good general idea. But be creative and think about how else to optimize. Sometimes loading data early is a good thing. I'll talk about that next.
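One small way to express "don't load it until you need it" in code is a lazy wrapper, sketched below in TypeScript. The `lazy` helper is something I made up for this example, not a library function. Nothing is requested until the first call, and later calls reuse the same result.

```typescript
// Wrap a loader so it runs only on first use and the result is reused afterwards.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      pending = loader().catch((error) => {
        pending = null; // allow a retry after a failed load
        throw error;
      });
    }
    return pending;
  };
}
```

For example, you could create `const getSettings = lazy(() => fetchSettings())` once (where `fetchSettings` is a hypothetical API call) and only call `getSettings()` when the settings screen mounts. Users who never open that screen never pay for the request.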

Be greedy, load ASAP

Let's flip this idea upside down now. Like I said, sometimes loading data you don't immediately need can be a good thing. The main reason it can be OK is that it reduces how often the app has to stop and load something. Sometimes spreading requests out too much can make your app feel slow because it is constantly loading things to display. So, wherever you feel you can load more without a huge effect on load time, go ahead and group calls together. It becomes a bit of an art of balancing things out here. See how you can make it perform its best.
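As a small illustration of grouping calls, the TypeScript sketch below starts several requests at once and waits for them together, so the user sees one loading phase instead of several. The `HomeApi` interface and its three endpoints are hypothetical. Note that this does not reduce the number of requests by itself; for that, the API would need to offer a combined endpoint.

```typescript
interface Profile { name: string; }
interface Post { id: string; title: string; }

// Hypothetical API surface: three independent endpoints the home screen relies on.
interface HomeApi {
  fetchProfile(): Promise<Profile>;
  fetchFeed(): Promise<Post[]>;
  fetchUnreadCount(): Promise<number>;
}

interface HomeData {
  profile: Profile;
  feed: Post[];
  unreadCount: number;
}

// Start all requests at once and wait for them together:
// one loading phase instead of three spinners at different times.
async function loadHomeData(api: HomeApi): Promise<HomeData> {
  const [profile, feed, unreadCount] = await Promise.all([
    api.fetchProfile(),
    api.fetchFeed(),
    api.fetchUnreadCount(),
  ]);
  return { profile, feed, unreadCount };
}
```

Keep in mind that `Promise.all` rejects as soon as any one request fails, so this fits best when the screen really needs all of the pieces.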

Store stuff locally

Redux Persist

This can be part of the caching strategy for loading things up immediately. It is great when you want to keep that immediate effect of cached data between launches. The user can close the app and open it again, and your app can simply load your locally stored data back up. Redux Persist is great because it interfaces with a lot of storage engines and works with Redux. Just configure your storage engine and it will rehydrate your store automatically. You may also use it to store sensitive information (such as login information), using the redux-persist-sensitive-storage storage engine.

SQLite

Another common way to store data locally is SQLite. This is especially great because it allows you to load only a slice of the data, based on a query. react-native-sqlite-storage is a library I've used previously. It is a great library, but the documentation is a bit hard to follow. I may do a small post about how to use it in the future.

Tactics for Efficient Processing

Memoize: Reselect for Redux

Sometimes the data we get is not in the exact format we need for displaying to the user, or for performing some operation on. Without memoization, this transformation typically runs again every time it is needed, even if the underlying data has not changed. With Reselect, the result of a transformation is cached, and as long as the selector's inputs have not changed, the cached result is handed to your components instead of being recomputed. This can save a lot of unnecessary processing and help your app run at its best.
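To show the idea without pulling in a library, here is a hand-rolled TypeScript sketch of a memoized selector. This is not Reselect's API (Reselect's `createSelector` does this more generally, with multiple inputs); `memoizeSelector` and the `CatalogState` shape are names I made up for the example.

```typescript
// Build a selector that re-runs `transform` only when its input changes
// (compared by reference), and otherwise returns the cached result.
function memoizeSelector<State, Input, Output>(
  selectInput: (state: State) => Input,
  transform: (input: Input) => Output,
): (state: State) => Output {
  let hasRun = false;
  let lastInput: Input | undefined;
  let lastOutput: Output | undefined;
  return (state) => {
    const input = selectInput(state);
    if (!hasRun || input !== lastInput) {
      lastInput = input;
      lastOutput = transform(input);
      hasRun = true;
    }
    return lastOutput as Output;
  };
}

// Hypothetical state shape for the usage example.
interface CatalogState {
  titles: string[];
  searchText: string;
}

// Sorting runs again only when `state.titles` is replaced with a new array.
const selectSortedTitles = memoizeSelector(
  (state: CatalogState) => state.titles,
  (titles) => titles.slice().sort(),
);
```

Because the comparison is by reference, this relies on the usual Redux habit of replacing arrays and objects rather than mutating them. Typing in the search box changes `searchText` but leaves `titles` untouched, so the sorted list is reused.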

Leverage your API

The API you are working with can help tremendously in consuming just the right amount of data at just the right time. Be sure to check with the API's creators or documentation so you understand what you have available to you. I will run through the usual use cases.

Paginate

Most APIs have a way to retrieve large results in sets, or pages, based on a size you can specify yourself. Since you typically don't show a whole lot of data all at once, you can paginate certain parts, either through actual pages in the UI or through infinite scrolling. This is an essential feature of an API that provides tons of data.
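Here is a minimal TypeScript sketch of a pager you could drive from infinite scrolling. The `Page` shape (with a `nextPage` field) and the `fetchPage` function are assumptions; real APIs signal "no more pages" in different ways, such as cursors or total counts.

```typescript
// Hypothetical page shape: `nextPage` is null when there is nothing left.
interface Page<T> {
  items: T[];
  nextPage: number | null;
}

type FetchPage<T> = (page: number, pageSize: number) => Promise<Page<T>>;

function createPager<T>(fetchPage: FetchPage<T>, pageSize: number) {
  let items: T[] = [];
  let nextPage: number | null = 1;
  let loading = false;

  return {
    getItems: (): T[] => items,
    hasMore: (): boolean => nextPage !== null,
    // Load the next page and append it; does nothing if a load is
    // already in progress or there are no pages left.
    async loadMore(): Promise<T[]> {
      const page = nextPage;
      if (loading || page === null) {
        return items;
      }
      loading = true;
      try {
        const result = await fetchPage(page, pageSize);
        items = items.concat(result.items);
        nextPage = result.nextPage;
      } finally {
        loading = false;
      }
      return items;
    },
  };
}
```

In a React Native list, `loadMore` is the kind of function you would call from `FlatList`'s `onEndReached` callback, copying the returned items into component state.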

Use "metadata" endpoints

Many times, you don't need an item's full details right away. Maybe you just need to know the item's title and a couple of general properties. Check if the API you are working with has endpoints that give you item metadata. Once the user has indicated they would like to see more details about an item, you can make a second API call for the item's full details. Usually the item's metadata will include an item id that you can use on the endpoint that returns its full data.
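In TypeScript, the split between metadata and details might look something like the sketch below. The `ItemSummary` and `ItemDetails` shapes and the two `CatalogApi` methods are hypothetical stand-ins for whatever your API actually provides.

```typescript
// Hypothetical shapes: the list endpoint returns lightweight summaries,
// the details endpoint returns the full record for one id.
interface ItemSummary {
  id: string;
  title: string;
}

interface ItemDetails extends ItemSummary {
  description: string;
  imageUrls: string[];
}

interface CatalogApi {
  fetchSummaries(): Promise<ItemSummary[]>;
  fetchDetails(id: string): Promise<ItemDetails>;
}

// For the list screen: only the lightweight metadata is downloaded.
async function loadList(api: CatalogApi): Promise<ItemSummary[]> {
  return api.fetchSummaries();
}

// For the detail screen: the full record is requested only after the
// user selects an item, using the id from its summary.
async function openItem(
  api: CatalogApi,
  summary: ItemSummary,
): Promise<ItemDetails> {
  return api.fetchDetails(summary.id);
}
```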

Filter your search down

This is common in APIs that have some sort of search functionality. If your API allows for filters or queries of some kind, make sure you optimize them so that you get only the most relevant results.

Download partial results

This goes in line with what I mentioned in optimizing your initial load time. See if your API returns partial data, where only updated data is returned instead of full results where most of the results may not have changed at all. It can be a little tricky to merge new results in, based on your app's complexity. This may be made easier if your API's data elements give you a version number you can check against, or some other value. If not, work with the API creators to determine the best way to merge in partial data.
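If your API does give each element a version number, the merge could be sketched like this in TypeScript. The `Versioned` shape is an assumption; the point is only that an incoming element replaces the stored one when it is strictly newer, so a stale or out-of-order response cannot overwrite fresher data.

```typescript
// Hypothetical shape: every element carries an id and an increasing version.
interface Versioned {
  id: string;
  version: number;
}

// Merge a partial result into the existing data without mutating it.
// An incoming element wins only if it is new or has a higher version.
function mergeByVersion<T extends Versioned>(
  current: Record<string, T>,
  incoming: T[],
): Record<string, T> {
  const merged: Record<string, T> = { ...current };
  for (const item of incoming) {
    const existing = merged[item.id];
    if (existing === undefined || item.version > existing.version) {
      merged[item.id] = item;
    }
  }
  return merged;
}
```

This sketch does not handle deletions. If elements can be removed on the server, the partial result needs some way to say so, which is another thing to sort out with the API creators.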

Call the API endpoint twice

This is a trick that comes in handy once in a while, and it works best on screens the user doesn't open often. It helps when an endpoint gives you one specific piece of data and neither obvious moment to call it is quite right. Calling it only when the screen opens is a bit too late, because nothing is populated right away and the user sees a blank screen. Calling it ahead of time may be too early, because the data could have changed by the time the user actually opens the screen.

What you can do is make the call ahead of time (for example, on the screen just before the one that displays the data), so that when the user opens the screen you can display the data right away. As soon as the screen opens, make the API call again, and if the new response shows that something did change, use it to refresh your pre-populated screen.

This may seem counterintuitive, but in cases like this, where you need to favor perceived load speed over keeping network calls low, it works pretty well.
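A rough TypeScript sketch of the "call it twice" trick is below. The `createPrefetcher` helper, the `fetchData` function, and the `render` callback are all made up for illustration, and the default equality check is a simple `===`, which you would likely replace with something suited to your data.

```typescript
function createPrefetcher<T>(
  fetchData: () => Promise<T>,
  isEqual: (a: T, b: T) => boolean = (a, b) => a === b,
) {
  let cached: { data: T } | null = null;

  return {
    // First call: made early, for example on the previous screen.
    async prefetch(): Promise<void> {
      cached = { data: await fetchData() };
    },
    // Second call: made when the screen opens. Shows the prefetched
    // data immediately, then re-renders only if the new response differs.
    async open(render: (data: T) => void): Promise<void> {
      const shown = cached;
      if (shown !== null) {
        render(shown.data);
      }
      const fresh = await fetchData();
      cached = { data: fresh };
      if (shown === null || !isEqual(shown.data, fresh)) {
        render(fresh);
      }
    },
  };
}
```

If the user opens the screen before the prefetch has finished (or it was never started), `open` simply falls back to a normal single load.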

I hope this was informative and it has helped you brainstorm what to apply to your own projects. Good luck!