A Pages tree in a PDF should, of course, be a balanced tree. But PDFs exist in which all the pages sit in a single array, and we must deal with them. I am faced with a file whose Pages tree has more than 10,000 elements in one Kids array. Performance is terrible and gets dramatically worse towards the end of the file; Acrobat and plug-ins appear to hang because even simple APIs take a long time to complete.
I know that if one adds pages entirely in Acrobat, the page tree comes out balanced. And I know there is no API specifically for rebalancing a tree. But I was wondering whether a particular API, or sequence of APIs, would lead to the tree being rebalanced as a side effect. I thought adding a page to the end of the file might work, but that just splits the array into one small array and one holding the rest of the elements (rather than a 50:50 split). Any ideas? I'm trying various things, but waiting around to see whether they work is rather tedious. (I suspect that adding a page after every 100 pages, then deleting the added pages, might have the desired effect.)
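
To make clear what I mean by "balanced", here is a rough sketch in plain Python of the shape I am after. It is not Acrobat API code: the function name `build_balanced_pages`, the `max_kids` limit of 10, and the dict layout are all my own assumptions for illustration, and a real PDF would also need the /Parent entry fixed up on every node.

```python
def build_balanced_pages(leaves, max_kids=10):
    """Group a flat list of page leaves into a tree in which no Pages
    node has more than max_kids children. Page order is preserved.

    Hypothetical model only: nodes are plain dicts, not PDF objects,
    and /Parent links are omitted.
    """
    if max_kids < 2:
        raise ValueError("max_kids must be at least 2")

    level = list(leaves)
    if not level:
        return {"Type": "Pages", "Kids": [], "Count": 0}

    while True:
        parents = []
        # Chunk the current level into groups of at most max_kids nodes.
        for start in range(0, len(level), max_kids):
            kids = level[start:start + max_kids]
            # Count is the number of leaf pages beneath this node.
            count = sum(
                kid["Count"] if kid["Type"] == "Pages" else 1
                for kid in kids
            )
            parents.append({"Type": "Pages", "Kids": kids, "Count": count})
        if len(parents) == 1:
            return parents[0]  # a single node is left: the new root
        level = parents


# Model of the problem file: 10,000 pages that were all in one Kids array.
flat_pages = [{"Type": "Page", "Index": i} for i in range(10000)]
root = build_balanced_pages(flat_pages, max_kids=10)
```

With 10,000 pages and at most 10 kids per node, every Kids array in the result has at most 10 entries, so finding a page means descending a few short arrays instead of walking one huge one.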