HUGE 📄file processor

by Ricardo Fernández Serrata

Version 7 (August 19, 2021)

Download (661 downloads)

This flow shows how to read and process large amounts of data without running out of memory. Thanks to buffering, data is loaded from storage one block at a time, so it won't blow up the memory heap.

The example data processor here is a bit flipper or byte inverter (I call it the "NOTter"): it inverts every bit of every byte. Of course, you can use buffering for something else, like encryption, regex find-and-replace, encoding, decoding, parsing, serializing, etc.
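As a rough illustration of the buffering idea outside Automate, here is a minimal Python sketch of a block-wise NOTter. It is not the flow's actual implementation: the function name `invert_file`, the file names, and the 4096-byte default are assumptions made for the example, and unlike the flow it overwrites its output instead of appending.

```python
import os
import tempfile

# Lookup table: byte value b maps to its bitwise NOT (b XOR 0xFF).
NOT_TABLE = bytes(b ^ 0xFF for b in range(256))

def invert_file(src_path, dst_path, block_size=4096):
    """Copy src_path to dst_path with every bit flipped, one block at a time."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)  # reads at most block_size bytes
            if not block:                 # an empty read means end of file
                break
            dst.write(block.translate(NOT_TABLE))

# Small demonstration on a throwaway file (hypothetical names).
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "input.bin")
inverted = os.path.join(workdir, "input.inverted")
with open(original, "wb") as f:
    f.write(bytes(range(256)) * 40)  # 10240 bytes, so it spans several blocks

invert_file(original, inverted, block_size=4096)
with open(inverted, "rb") as f:
    inverted_bytes = f.read()
print(inverted_bytes[:4])
```

Swapping `block.translate(NOT_TABLE)` for another per-block transformation is where encryption, encoding, or similar processing would go.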

A[1] (Block Size) must be specified in bytes, not kilobytes. A power of 2 larger than 256 is recommended. A[1] defines dd's block size and this flow's maximum string size. Bigger is faster, but with diminishing returns.
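To make the block size advice concrete, the following Python sketch checks a candidate value against the recommendation (a power of 2 larger than 256) and computes how many blocks a file of a given size needs. The helper names `check_block_size` and `blocks_needed` are hypothetical and not part of the flow.

```python
def check_block_size(block_size):
    """Return True if block_size is a power of two larger than 256."""
    is_power_of_two = block_size > 0 and (block_size & (block_size - 1)) == 0
    return is_power_of_two and block_size > 256

def blocks_needed(file_size, block_size):
    """Number of block reads needed to cover file_size bytes (ceiling division)."""
    return -(-file_size // block_size)

KIB = 1024  # the block size is given in bytes, so 4 KiB is written as 4096

print(check_block_size(4 * KIB))                 # value in bytes
print(check_block_size(4))                       # "4" meant as kilobytes fails the check
print(blocks_needed(10 * 1024 * KIB, 4 * KIB))   # block reads for a 10 MiB file
```

Doubling the block size halves the number of reads, which is one way to see why the gains flatten out once the count is already small.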

This flow always appends data instead of overwriting, so don't invert an inverted file while the original (non-inverted) file is still in the same directory with the same name.

Please understand that Automate (AM) is slow even for 1 MB files, especially when iterating over individual bytes instead of SWords, DWords, or QWords.
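The difference between iterating over bytes and over wider words can be sketched in Python as follows. This only illustrates the idea of touching 8 bytes per step instead of 1. It says nothing about Automate's actual performance, and both function names are assumptions for the example.

```python
import struct

QWORD_MASK = 0xFFFFFFFFFFFFFFFF  # 64 one-bits

def invert_bytewise(block):
    """Invert a block one byte per iteration."""
    return bytes(b ^ 0xFF for b in block)

def invert_qwordwise(block):
    """Invert a block 8 bytes per iteration; leftover tail bytes are handled singly."""
    n_words = len(block) // 8
    head_len = n_words * 8
    # Interpret the head of the block as little-endian unsigned 64-bit words.
    words = struct.unpack(f"<{n_words}Q", block[:head_len])
    head = struct.pack(f"<{n_words}Q", *(w ^ QWORD_MASK for w in words))
    tail = bytes(b ^ 0xFF for b in block[head_len:])
    return head + tail

sample = bytes(range(256)) + b"\x01\x02\x03"  # length is not a multiple of 8
print(invert_qwordwise(sample) == invert_bytewise(sample))
```

Both functions produce the same output; the QWord version simply runs one eighth as many loop iterations over the main part of the block.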

Piping dd output to the xxd, base64, or od commands is bad for backward compatibility and memory allocation (even though storage R/W is improved), so I avoided them. Also, AM's Base64 decoder doesn't support custom charsets like ISO-8859-1, so trying to decode Base64 corrupts the data. To avoid this, xxd and hexDecode would have to be used, but that makes memory allocation even worse.
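Why the charset matters can be shown with a small Python sketch. It does not reproduce Automate's decoder; it only demonstrates the general point that ISO-8859-1 maps each of the 256 byte values to exactly one character, so binary data survives a trip through a string, while a charset such as UTF-8 does not guarantee that.

```python
import base64

raw = bytes([0x00, 0x7F, 0x80, 0xFF])            # includes bytes above 0x7F
encoded = base64.b64encode(raw).decode("ascii")  # Base64 text form
decoded = base64.b64decode(encoded)              # back to raw bytes

# ISO-8859-1: every byte becomes one character and converts back unchanged.
as_latin1 = decoded.decode("iso-8859-1")
round_trip_latin1 = as_latin1.encode("iso-8859-1")

# UTF-8: 0x80 and 0xFF are invalid here, so they are replaced and the data is lost.
as_utf8 = decoded.decode("utf-8", errors="replace")
round_trip_utf8 = as_utf8.encode("utf-8")

print(round_trip_latin1 == raw)
print(round_trip_utf8 == raw)
```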

If you stop the flow while it's processing, the temporary file won't be deleted. You should then delete any .tmp files in the /data/data/com.llamalab.automate directory.
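The cleanup step could look like the Python sketch below. The helper name `remove_tmp_files` is hypothetical, and the demonstration runs in a throwaway directory rather than the real app directory, which usually isn't accessible without elevated permissions.

```python
import os
import tempfile

def remove_tmp_files(directory):
    """Delete every *.tmp file directly inside directory; return the removed names."""
    removed = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if name.endswith(".tmp") and os.path.isfile(path):
            os.remove(path)
            removed.append(name)
    return sorted(removed)

# Demonstration with made-up file names in a temporary directory.
demo_dir = tempfile.mkdtemp()
for name in ("a.tmp", "b.tmp", "keep.bin"):
    open(os.path.join(demo_dir, name), "wb").close()

removed = remove_tmp_files(demo_dir)
remaining = sorted(os.listdir(demo_dir))
print(removed, remaining)
```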

4.8 average rating from 8 reviews

Rate and review within the app in the Community section.