Sunday, June 22, 2014

SharePoint: Access large list

SharePoint list throttling limits to access maximum 5000 items through view/query. However your list can hold as much as 30,000,000 items but you can’t access more than 5000 items at once by view or query. But I’ve seen scenarios where list items goes millions in a year. So accessing large list having more than 5000 items not very rare in real life.

But have you ever asked yourself, why the limit is 5000 items?
The real limit of 5000 items is related to SQL server. I’m not a SQL server expert but let me try to explain as a non-SQL server expert – “If a SQL query returns more than 5000 items, SQL server locks the entire table – which will affect other queries on the table”.

So as a SharePoint expert you might need to be in a situation where you need to work with large list (having more than 5000 items). However, there’s few options you have when you need to access large list. I’ll explain different options with pros and cons.

 

Option 1: Override the throttling limit (Less preferable)

You can override the throttling limit from web application settings or through code. I would not suggest to do it from web application as it’ll be applied throughout the web application, instead if you querying your list then you can override the setting as shown below:

SPQuery spqry = new SPQuery();
spqry.QueryThrottleMode = SPQueryThrottleOption.Override;

The easy thing about this approach is that you don’t need to go through any complexity of writing complex code but downside is your query is taking excessive resources affecting other users. Points to note:

  • Easy to use, as you don’t need to include any complexity in your code if you are using SPQuery. If your list views return more than 500 items, the views will not fail
  • Also this approach will send a single (possibly) SQL query that will return all items
  • However, other users browsing the site will find the site slow – as the full table lock will be applied in SQL server delaying other users request to be queued
  • If you are loading all of the items in memory, then your server should be able to handle the items
  • You can override the throttling limit, if your server is able to handle all the items in memory and you are executing your script during off-peak hour. I would prefer overriding the throttling limit, if, for example, I’m moving all data for archiving during off peak hour. More specifically, if you need to run the script periodically or once during off-peak hour then you may be override the throttling limit.

 

Option 2: Access less than 5000 items Iteratively

With this approach, you don’t override the SharePoint throttling limit (always preferable). Rather you loop through and access less than 5000 items on each iteration (say, 2000 items per iteration). The following block shows the idea:

var list = web.Lists[listName];
var maxItemId = list.Items[list.ItemCount - 1].ID; //get max item id Or use below code to get max id using CAML
//spquery.RowLimit = 1;
//spquery.Query = "<OrderBy><FieldRef Name='ID' /></OrderBy>";
var startIndex = 1;
const int processByCount = 4000; //max no of items to be returned by spquery
var endIndex = startIndex + processByCount;
while (startIndex < maxItemId)
{
    var query = new SPQuery
    {
        Query = string.Format(@"<Where>
                            <And>
                                <Geq><FieldRef Name='ID' /><Value Type='Counter'>{0}</Value></Geq>
                                <Leq><FieldRef Name='ID' /><Value Type='Counter'>{1}</Value></Leq>
                            </And>
                        </Where>", startIndex, endIndex)
    };
    var items = web.Lists["ListName"].GetItems(query);
    //process items to convert to CSV
    startIndex = endIndex + 1;
    endIndex = startIndex + processByCount;

The above approach is a good option for querying items through SPQuery. However, for list view, you need to modify your views to make sure the view doesn’t return more than 500 items. Points to note:

  • You need to modify your existing views so that they don’t return more than 5000 items.
  • Once you modify your list views (as well as change the code to access less than 5000 items), the queries/views will have less affect on other live users.
  • However, you need to modify your existing code (or new code) to use the pattern to access not more than 5000 items.
  • Also for each loop, you will send a SQL query that will return the items.

 

Option 3: Use ContentIterator

SharePoint Server provides an API to access individual items using ‘ContentIterator’. ContentIterator is kind of complete API for processing large list items. Points to note:

  • The API has full capabilities of handling different scenarios
  • And the API is available in Office Server version not in Foundation.

 

Conclusion

Overriding throttling limit is in web application is the last option I would suggest. However overriding throttling limit through SPQuery can be used if you are running your query during off-peak hour and possibly one time. Option 2 of ‘Iteratively accessing less than 5000 items’ is more suitable as you can control how many items you would like to access on each request. ContentIterator is the preferable way, if you are using Server version of SharePoint.