Neural Network Rules

That would be a great opportunity to learn a bit more about how that works. Regarding neural networks, I only know the concept, and it really interests me. I also prefer using only newspapers for rule validation and creation, and I can point you to proper sources. I would stay away from Wikipedia and Tatoeba when creating the model, because they mix 4 different Portuguese standards, which would lower the quality of the model.

Google doesn’t have a publicly available Portuguese n-gram data set, although I prepared the Portuguese support to accept n-gram data if the user builds it or buys it online. Some disabled confusion pairs can be found here:

NOTE: The header says "# English confusion sets" because I forgot to change it when creating the dummy file. The pairs are Portuguese.

The best pairs, in my opinion, are not there. The e->é and por->pôr pairs are best suited for this rule, since pattern rules are not good at detecting these confusions.
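To make the idea concrete, here is a minimal Python sketch of how training examples for a pair like e/é could be pulled out of a corpus: for each occurrence of either word, keep a few neighbouring words as context and the word itself as the label. The function name, the window of two words on each side, and the sample sentences are all my own assumptions for illustration, not how the existing rule is actually trained.

```python
import re

# Hypothetical settings: the pair to learn and the context size are assumptions.
PAIR = ("e", "é")
WINDOW = 2

TOKEN_RE = re.compile(r"\w+")


def extract_examples(sentences, pair=PAIR, window=WINDOW):
    """Return (left_context, right_context, label) tuples for each pair word."""
    examples = []
    for sentence in sentences:
        tokens = [t.lower() for t in TOKEN_RE.findall(sentence)]
        for i, tok in enumerate(tokens):
            if tok not in pair:
                continue
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            # Skip occurrences without a full context on both sides.
            if len(left) < window or len(right) < window:
                continue
            examples.append((tuple(left), tuple(right), tok))
    return examples


# Made-up sample sentences, only to show the shape of the output.
sample = [
    "O João e a Maria foram ao mercado.",
    "Este livro é muito bom.",
]
examples = extract_examples(sample)
for left, right, label in examples:
    print(left, right, label)
```

A classifier would then learn to predict the label from the two context tuples, and the rule could flag a sentence when the predicted word differs from the one actually written.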

The best free corpus I know of for this task is the DCEP: Digital Corpus of the European Parliament.
It is available for all major European languages, and even some minor ones.
You can find it at: DCEP: Digital Corpus of the European Parliament - European Commission

If you guide me through the process, I can do some of the grunt work, and I will gladly give feedback on the documentation if you wish.