
Array efficiency

Let's say we want a 'list' of values of a particular field (e.g. Text type) from records in a child table. Easy to do in a Formula field in the parent record, but is it more 'efficient' to just create an array of those values, or a text concatenation of the array?
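
For example, assuming the parent has a reference to the child table called Names with a Text field Name (hypothetical names), the choice would be between a formula like Names.Name, which yields an array, and concat(Names.Name), which yields a single comma-separated text string.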

When I say 'efficient', I mean size of data stored and also speed of Ninox processing of that data, like searching in it, adding to it or subtracting etc.

For an individual record, it'll make little difference I realise, but when large tables are involved, that's a lot of processing of those 'lists' and storage space in memory, so choosing the more 'efficient' Formula could have a noticeable effect.

Anyone know anything about the comparable 'efficiency' of those 2 data types?

6 replies

    • Fred
    • 6 days ago

    I was hoping others would have jumped in by now. I have no idea which is more efficient; I am now also eager to learn. Have you tried support?

      • UKenGB
      • 5 days ago

      I haven't, no. Not even sure how I can do that these days.

      • Fred
      • 5 days ago

      Email support@ninox.com

    • info.39
    • 5 days ago

    Hey all,

    We are a Ninox Partner specialized in optimizing database performance, troubleshooting bottlenecks, building custom HTML/JS views and solving many other very complex problems. Here is our take.

    It basically comes down to the use case. There is no single answer; it depends on multiple factors, so let's break it down.
    1. How many read/write operations do you expect per unit of time?
    2. How long, in character length, is the content you want to process?
    3. What is an acceptable loading time for your use case?

    Example 1:
    Let's say you have a very small number of product groups, e.g. 5 groups, but an enormous number of products in each group, let's say 50,000 per group.
    You want to concatenate the name of the product, the amount and the storage lot per product into the array.
    -> This results in an enormously large array (in character length), and if you write that to a text field you will run into extreme performance problems trying to open or handle that record. This is not advised. Very long texts can make the database size explode, especially when the change history is enabled. It can go as far as crashing some mobile devices during startup of the native apps, because the field contents are simply too large. Use a formula field instead, especially if you do not access the information often. Read the next chapter.
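
    To illustrate (Products, Name, Amount and Lot are hypothetical names here), the problematic pattern would be a trigger or button writing the concatenation into a stored text field:

    ProductSummary := concat(Products.(Name + " | " + Amount + " | " + Lot))

    The same expression placed in a formula field is computed on demand instead and adds nothing to the stored database size.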

    Example 2: 
    You have basically the opposite of the first case. Thousands of product groups, but only a handful of products per group. 
    You want to concatenate the same info as before.
    -> This results in relatively small arrays and it is acceptable to store these in a text field. However, read the next chapter.

    The next chapter:
    You are working in a database context, so try to use the database to your advantage where possible.
    You already have a parent-child connection (or can filter records via record ids from dynamic choice fields) so you can leverage that. 
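
    As a quick sketch of the dynamic choice route (DynChoice and Products are hypothetical names), the selected record ids can be resolved back into records without storing anything extra:

    let ids := numbers(DynChoice);
    for n in ids do
        record(Products, n).Name
    end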

    The basic problem I see with creating arrays to hold intermediate content is simply that the database grows in size due to duplication of data.
    There are a few cases where it is advisable to use such fields, but they are not the preferred solution. So let's elaborate on some of them.

    If you are accessing the information many times per minute, for example on dashboards (updated on every change of the underlying records, for every user on that dashboard), then you should prioritize performance over storage space most of the time. The best example for this is cascading calculations of sums/averages on a KPI board. The following real-life case is very common and we have seen it many times in different flavours.

    Example: 
    You are tracking your working time in a log, per task. Let's say you average 15 tasks per project and work on 5 projects a day. You are in a team of 20, all doing the same amount on average. You work in a company with 5 locations, again all averaging the same amounts, and for simple calculation you work exactly 20 days a month, every month.
    Now you want to calculate 5 key indicators on the main dashboard to show the workload/performance. 
    That means, for every number you have to do the following amount of calculations:
    15 tasks/project * 5 projects/day * 20 team members * 5 locations * 20 days/month * 12 months =
    1,800,000 calculations per indicator displayed.
    Multiply that by the number of users on the dashboard and you have an enormous number of calculations dragging down performance.
    And the server has to do a lot of that work too, as you need to run many selects. Even superficial optimizations like do as transaction/do as server are not going to help you much, as the server is then even more strained.

    Instead, you should use a trigger after update. Write targeted functions which update the chain of records upwards, but do not recalculate the whole tree downwards again.
    When a new task-tracking record is created, recalculate the project and write the result to a data field (a number field) on the project. Then recalculate the day, but do not start at the task level; instead take the existing fields from the projects. Then recalculate the team, but again do not start at the task level; take the existing fields from the days, etc.
    That way you are duplicating data, but only a small amount and only what is strictly necessary. This leads to a reduced number of calculations, performed only when they are actually needed, not every x seconds per user on the dashboard.
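
    A minimal sketch of that pattern, assuming a hypothetical chain TimeLog -> Project -> Day -> Team where each level caches a number field TotalHours, placed in the 'Trigger after update' of the TimeLog table:

    let p := Project;
    p.(TotalHours := sum(p.TimeLogs.Hours));
    let d := p.Day;
    d.(TotalHours := sum(d.Projects.TotalHours));
    let t := d.Team;
    t.(TotalHours := sum(t.Days.TotalHours))

    Each level reads only the cached fields one level below it, so the raw task records are aggregated exactly once.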

    What is the way forward then?
    Now coming back to the main question about filtering and storing information. 
    My take is: use the database where possible. Do not write arrays/JSON unless you strictly need them (for example for API calls or edge cases). Write into the child records directly, filter efficiently, reduce the number of reads/writes where possible and leverage the scripting language as well as you can.

    Best Practice:
    - Formula fields are not data fields. Do not use them to store information that will be re-used by some other calculation or formula field, unless the length of the data is huge.
    - If you have parent/child connections, use them via for-loops or dot-notation. 
    - If you cannot use a parent/child connection, use select Tablename where. Do not filter select Tablename[] with brackets; you are otherwise throwing away the optimisations that the backend has implemented (see the sketch after this list).
    - If you can, filter with boolean conditions and avoid if/else statements in the filter.
    - Order the conditions in your filter by data type:
    booleans - numbers - all other types - text. A text search is many thousands of times slower than a boolean comparison.
    - Within that, put the most selective condition first, i.e. the one that removes the largest number of records from the pool, while still obeying the aforementioned data-type order.
    - If you have a low number of reads/writes and very few users, consider using do as transaction to offload some calculations to the server. Do not do that for many users/calls, as the server will then quickly become the performance bottleneck.
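
    As announced above, a small sketch of the select guidance (Orders, Paid, Amount and Customer are hypothetical names). Preferred, letting the backend optimise the filter, with conditions ordered boolean, number, text:

    select Orders where Paid and Amount > 1000 and Customer = "ACME"

    To be avoided, since it loads every record first and then filters the array in memory:

    (select Orders)[Paid and Amount > 1000 and Customer = "ACME"]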

    With these tips and some more case-by-case considerations, we have successfully reduced loading times in customer tables from >90 seconds down to 2-3 seconds, sometimes even lower.

    The topic is endless. Every case is different; a consideration that holds for one table or use case can be void for another. It is important to understand the task, the constraints and the expectations every time.

    Whether you are a Ninox partner or a Ninox user, we offer our expertise, and you can request assistance with these and many more topics at info@goetje-ing.de.

    Greetings Philipp

    • UKenGB
    • 4 days ago

    Wow, that was an answer. Thanks for that. However, far more involved than I was thinking.

    Let's say there's a child table of 'Names' used by the parent record. A list of those names is often required elsewhere, so it's worth creating a formula field in the parent table that contains all (typically <10) of its Names (each, e.g., <16 characters).

    So my question was whether that list would be better as an array or a text concatenation of the Names. Either can be conveniently displayed without further processing, so it's just whether a formula field as an array is more 'efficient' than a text string equivalent of the array. Which is quicker for Ninox to calculate in the first place, which takes less memory and which is then quicker to display in a table or a form?

    In my case, it is strictly just the app and not a Cloud-shared database, and the number of records I am dealing with is very small compared to large business systems. However, the numbers don't affect which is better; larger numbers simply make any difference more noticeable.

    Since I am only dealing with small datasets, any difference may be negligible, but it's still something I'd like to know about Ninox.

      • info.39
      • 4 days ago

      Formula fields are not "stored" in the database. Database fields, like text, numbers, dates, symbols, etc. are stored. 
      So when it comes to size of the database, using a formula is (for this argument) a net addition of zero. 
      If you write the array to a text field, you create "new data" that has to be stored. 

      When comparing different formulas to get the same result, there are better and worse ways, as always.

      Let's compare:

      Option 1:

      let myString := "";
      let myChildren := Children;
      for i in myChildren do
        myString := myString + (if myString = "" then "" else ", ") + i.Name
      end;
      myString

      Option 2:

      let myString := "";
      let myChildren := Children;
      let myIndex := 0;
      let myCount := cnt(myChildren);
      while myIndex < myCount do
        myString := myString + (if myString = "" then "" else ", ") + item(myChildren, myIndex).Name;
        myIndex := myIndex + 1
      end;
      myString

      Option 3:

      concat(for i in Children do
        i.Name
      end);

      Option 4: 

      concat(Children.Name);
      

      All of them lead to a string of names separated by commas. Which one is better? I argue that numbers 3 and 4 are the best. They reduce overhead, cannot run infinitely due to a badly set up while loop, and are easy to read and write. Transferring knowledge from other languages, for loops are among the fastest options, unless there is some other optimization for special cases.

      You could do the same thing for arrays. Just remove the concat() wrapper, or skip writing to a variable. It basically comes down to how you want to use the data. Do you need it as a display value only? A string is fine. Do you want to iterate it, filter it or otherwise process it? An array is preferred, as you can skip the split() step on the other side.
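
      For instance, the array equivalents of options 3 and 4 would simply be:

      for i in Children do
        i.Name
      end

      and

      Children.Name

      Both return an array of names that can be iterated or filtered directly, with no split() needed on the consuming side.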

      Hope that fits your question.
