Forum

Thread tagged as: Question, Problem, Runway

Slow count on large collection

I am having issues with performance when it comes to large collections (between 500-1000 entries)

First of all, when I get the collection without any filter (but with a count-limit) it takes way longer than without any filter. My way around it is to add a dummyfilter, as you can see below in the diagnostics (idx.indexValue != 'asd'). This shaves off a huge amount of time. (6 seconds as a rough average)

But even with that, it doesn't perform as fast as I want, and the problem seems to be the counting (SELECT FOUND_ROWS() AS count)

I've optimised my indexing, and both the revisions and index tables have been flushed and kept at minimum. One of the collections has 1 related block (but only one way), and the other has none. The latter performs about a second better as a rough average (but also has 200-300 less entries)

If I actually use a filter (i.e year) and it pulls in way more entries than the "last 30", it performs WAY faster. That's the performance I want. But say I use that workaround – come january 2018, its not very optimal showing 0 listing and having the entries from barely weeks ago being hidden.

Anyone with input on this would be much appreciated!

Time    Δ   Debug Message - Perch Runway 3.0.10
0,0293  0   [1] SELECT u.*, r.* FROM perch3_users u, perch3_user_roles r WHERE u.roleID=r.roleID AND u.userEnabled=1 AND u.userID=1 AND u.userHash='3c549a6ca7681ee06cfa7356c6528451' LIMIT 1
0,0383  0,0089  UPDATE perch3_users SET userHash='29770ae7041e7a695368e29613c02358' WHERE userID='1'
0,0398  0,0015  [30] SELECT p.pagePath, pr.routePattern, pr.routeRegExp, p.pageTemplate, pr.routeOrder, s.settingValue AS siteOffline FROM perch3_pages p LEFT JOIN perch3_page_routes pr ON p.pageID=pr.pageID LEFT JOIN perch3_settings s ON s.settingID='siteOffline' UNION SELECT NULL AS pagePath, pr2.routePattern, pr2.routeRegExp, pr2.templatePath AS pageTemplate, pr2.routeOrder, NULL AS siteOffline FROM perch3_page_routes pr2 WHERE templateID!=0 ORDER BY routeOrder ASC, pagePath ASC
0,0421  0,0024  Matched page: /saker/rikssaker, so not using routes.
0,0422  0,0001  Using master page: /templates/pages/sak.php
0,0447  0,0025  [1] SELECT * FROM perch3_pages WHERE pagePath='/saker/rikssaker' LIMIT 1
0,2423  0,1976  Using template: /templates/pages/attributes/default.html
0,2441  0,0018  Using sub-template: /templates/pages/attributes/seo.html
0,2453  0,0012  [17] SELECT * FROM perch3_pages WHERE pageNew=0 AND pageHidden=0 AND pageDepth >=0 AND pageDepth<=2 ORDER BY pageTreePosition ASC
0,246   0,0007  [1] SELECT pageTreePosition FROM perch3_pages WHERE pagePath='/saker/rikssaker' LIMIT 1
0,2681  0,0221  [2] SELECT pageID FROM perch3_pages WHERE pageTreePosition IN ('000-003-001', '000-003', '000') ORDER BY pageTreePosition DESC
0,2809  0,0128  [4] Using template: /templates/navigation/item.html
0,2868  0,0059  [1] SELECT collectionID, collectionTemplate FROM perch3_collections WHERE collectionKey='Saker'
0,29            0,0032  [30] SELECT SQL_CALC_FOUND_ROWS DISTINCT * FROM ( SELECT idx.itemID, ci.collectionID, ci.itemJSON, idx2.indexValue as sortval FROM perch3_collection_index idx JOIN perch3_collection_items ci ON idx.itemID=ci.itemID AND idx.itemRev=ci.itemRev AND idx.collectionID=ci.collectionID JOIN perch3_collection_revisions cr ON idx.itemID=cr.itemID AND idx.itemRev=cr.itemRev AND idx.collectionID=ci.collectionID JOIN perch3_collection_index idx2 ON idx.itemID=idx2.itemID AND idx.itemRev=idx2.itemRev AND idx2.indexKey='ID' WHERE (idx.collectionID=1) AND ((idx.indexKey='ID' AND idx.indexValue != 'asd')) AND idx.itemID=idx2.itemID AND idx.itemRev=idx2.itemRev ) as tbl GROUP BY itemID, itemJSON, sortval ORDER BY sortval DESC LIMIT 0, 30
2,4391  2,1491  [1] SELECT FOUND_ROWS() AS `count`
2,4456  0,0064  [30] Using template: /templates/content/sak-liste.html
2,4639  0,0184  [17] SELECT * FROM perch3_pages WHERE pageNew=0 AND pageHidden=0 AND pageDepth >=0 AND pageDepth<=2 ORDER BY pageTreePosition ASC
2,4736  0,0097  [1] SELECT pageTreePosition FROM perch3_pages WHERE pagePath='/saker/rikssaker' LIMIT 1
2,4805  0,0068  [2] SELECT pageID FROM perch3_pages WHERE pageTreePosition IN ('000-003-001', '000-003', '000') ORDER BY pageTreePosition DESC
2,4813  0,0008  [4] Using template: /templates/navigation/sidekart.html
2,4871  0,0059  Request time: 2,4871
2,4871  0   Process time: 2,4871
2,4871  0   Memory: 1,6653
Eddie Dale

Eddie Dale 0 points

  • 3 years ago
Drew McLellan

Drew McLellan 2638 points
Perch Support

Can you show us your perch_collection() call?

Sure!

perch_collection('Saker', array(
                'template'  =>  'sak-liste.html',
                'count'     =>  30,
                'sort'      => 'ID',
            'sort-order' => 'DESC',
                'paginate' => true,
            'pagination-var' => 'side',
            'filter'    =>  'ID',
                'match'     =>  'neq',
                'value'     =>  'asd'
            ));

Looking at the sql-commands alone:

  • This takes most of the time around 2.5s
SELECT SQL_CALC_FOUND_ROWS DISTINCT * FROM ( SELECT idx.itemID, ci.collectionID, ci.itemJSON, idx2.indexValue as sortval FROM perch3_collection_index idx JOIN perch3_collection_items ci ON idx.itemID=ci.itemID AND idx.itemRev=ci.itemRev AND idx.collectionID=ci.collectionID JOIN perch3_collection_revisions cr ON idx.itemID=cr.itemID AND idx.itemRev=cr.itemRev AND idx.collectionID=ci.collectionID JOIN perch3_collection_index idx2 ON idx.itemID=idx2.itemID AND idx.itemRev=idx2.itemRev AND idx2.indexKey='ID' WHERE (idx.collectionID=1) AND ((idx.indexKey='ID' AND idx.indexValue != 'asd')) AND idx.itemID=idx2.itemID AND idx.itemRev=idx2.itemRev ) as tbl GROUP BY itemID, itemJSON, sortval ORDER BY sortval DESC LIMIT 0, 30

//accompanied collection function:
perch_collection('Saker', array(
                'template'  =>  'sak-liste.html',
                'count'     =>  30,
                'sort'      => 'ID',
                'sort-order' => 'DESC',
                'paginate' => true,
                'pagination-var' => 'side',
                'filter'    =>  'ID',
                'match'     =>  'neq',
                'value'     =>  'asd'
            ));
  • This takes around 25-30s. That is; if I do not use the part of the call with the filter
SELECT SQL_CALC_FOUND_ROWS DISTINCT * FROM ( SELECT idx.itemID, ci.collectionID, ci.itemJSON, idx2.indexValue as sortval FROM perch3_collection_index idx JOIN perch3_collection_items ci ON idx.itemID=ci.itemID AND idx.itemRev=ci.itemRev AND idx.collectionID=ci.collectionID JOIN perch3_collection_revisions cr ON idx.itemID=cr.itemID AND idx.itemRev=cr.itemRev AND idx.collectionID=ci.collectionID JOIN perch3_collection_index idx2 ON idx.itemID=idx2.itemID AND idx.itemRev=idx2.itemRev AND idx2.indexKey='ID' WHERE (idx.collectionID=1) AND idx.itemID=idx2.itemID AND idx.itemRev=idx2.itemRev ) as tbl GROUP BY itemID, itemJSON, sortval ORDER BY sortval DESC LIMIT 0, 30

//accompanied collection function:
perch_collection('Saker', array(
                'template'  =>  'sak-liste.html',
                'count'     =>  30,
                'sort'      => 'ID',
                'sort-order' => 'DESC',
                'paginate' => true,
                'pagination-var' => 'side',
            ));
  • But lastly, if i skip the count and go by year. ie. 2016, and instead let all the 114 rows show, the query only takes around 500ms
SELECT * FROM ( SELECT idx.itemID, ci.collectionID, ci.itemJSON, idx2.indexValue as sortval FROM perch3_collection_index idx JOIN perch3_collection_items ci ON idx.itemID=ci.itemID AND idx.itemRev=ci.itemRev AND idx.collectionID=ci.collectionID JOIN perch3_collection_revisions cr ON idx.itemID=cr.itemID AND idx.itemRev=cr.itemRev AND idx.collectionID=ci.collectionID JOIN perch3_collection_index idx2 ON idx.itemID=idx2.itemID AND idx.itemRev=idx2.itemRev AND idx2.indexKey='ID' WHERE (idx.collectionID=1) AND ((idx.indexKey='aar' AND idx.indexValue=2016)) AND idx.itemID=idx2.itemID AND idx.itemRev=idx2.itemRev ) as tbl GROUP BY itemID, itemJSON, sortval ORDER BY sortval DESC

//accompanied collection function:
perch_collection('Saker', array(
                    'template'  =>  'sak-liste.html',
                    'sort'      => 'ID',
                    'sort-order' => 'DESC',
                    'filter'    =>  'aar',
                    'match'     =>  'eq',
                    'value'     =>  perch_get('id')
                ));
Drew McLellan

Drew McLellan 2638 points
Perch Support

Are you indexing fields in your template that you never need to sort or filter by?

Not really. Only indexing the ones I believe necessary.

On the slowest collection that is 12 rows per item. (3 of them being the _date, _order and _id).

On the quickest (but with similar problems) only 2 fields are indexed other than the obligatory 3 _x's. But it also has (1) fewer entries, (2) way fewer input-fields in the collection and (3) no related blocks.

Ok, so I believe I've narrowed it down to the hosting provider having a slow database. Locally it runs way faster.

But still, a query without any filter runs at 900ms, whereas with a filter (even with no purpose) it runs at 50ms. 900ms is a bit much still, but its not an issue to add a dummyfilter for the performance boost. Just makes me think that the non-filtered query could be optimised.

sql server of hosting provider if interesting: (seems a bit old) MySQL client: mysqlnd 5.0.12-dev MySQL server: 5.5.5-10.0.32-MariaDB-1~wheezy

Drew McLellan

Drew McLellan 2638 points
Perch Support

Sure, you'll always trade some performance for the flexibility of being able to throw a completely flexible schema at a system and have it filter and sort any way you want it to. Nothing ever comes for free, it's all trade offs in one direction or another.

Have you tried the filter-mode option?

'filter-mode' => 'ungrouped',

Sure. That makes sense.

I can't see any difference to the sql query being ran when using the ungrouped option. I've tried comparing with or without it, both with my id!='asd' solution and without. It seems to group by in the query either way.

That being said, I am happy with the current solution. As long as I have a filter on, it performs very well locally, and decently on live server. I just have to find a better host to get the last bit of performance boost. Potentially also having even fewer indexed fields might help.

Much appreciated the help!