Count error types

Hi guys!

I have been playing around with LanguageTool to count the error types in a student text. I’ve included my Python code below. The output I get is the following:

Number of errors: 14
Frequencies of error types: Counter({'ENGLISH_WORD_REPEAT_BEGINNING_RULE': 9, 'MASS_AGREEMENT': 2, 'HE_VERB_AGR': 1, 'CONFUSION_OF_ME_I': 1, 'MANY_TIME': 1})

Here is the sample text:

Me open account. Me want save money. Me put money in bank. Me happy. You watch movie? Me watch movie. Movie about people. People do things. Movie make me feel. Me like movie. Movie have many parts. Parts make story. Story have beginning, middle, end. Me understand story. People in movie have feeling. Feeling make me feel. Me feel happy, sad, scared. Movie make me feel. Camera show places. Places pretty. Me like to see. Me see big city, small town. Me like places in movie. Movie have music. Music make me feel. Music happy, sad, scared. Me like music in movie. Me see movie with friend. Friend talk about movie. Me talk about movie too. Me and friend like movie. Movie have many colors. Colors pretty. Me like to see. Colors make movie beautiful. Me see red, blue, yellow. Me like colors in movie. Me watch movie many time. Me understand movie better. Me see new things. Me like to see more. Movie have message. Message important. Message teach me something. Me learn from movie. Me like message in movie. Me tell other people about movie. Me say, You watch movie. Movie good. Movie make you feel. Me want other people to watch movie too.

This is just some synthetic text to show errors.

The problem is that I want to be able to list how many times each error occurs in a text. As you can see, the “CONFUSION_OF_ME_I” error occurs far more than once in the sample, yet it is only counted once. Is this standard behavior for LanguageTool, or am I missing something in the documentation for how to do this?


import language_tool_python
from collections import Counter

def check_grammar(text):
    # Start a LanguageTool instance for US English
    tool = language_tool_python.LanguageTool('en-US')
    # Each match is one issue reported by LanguageTool
    matches = tool.check(text)
    error_count = len(matches)
    # Tally how often each rule ID was reported
    error_frequencies = Counter(match.ruleId for match in matches)

    return error_count, error_frequencies

# Example text (placeholder; substitute the sample text above)
text = "Text"

# Check grammar
error_count, error_frequencies = check_grammar(text)
print(f"Number of errors: {error_count}")
print(f"Frequencies of error types: {error_frequencies}")
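To see which part of the text each rule actually flagged, a rough sketch like the following might help. It groups the flagged spans by rule ID, which should make it easier to tell whether a rule fired only on one particular sentence. The helper name group_flagged_text is made up for illustration. It assumes each match exposes offset and errorLength next to the ruleId used above; those attribute names may differ between language_tool_python versions. The SimpleNamespace objects are hand-written stand-ins, not real LanguageTool output, so the snippet runs without a LanguageTool server.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_flagged_text(text, matches):
    """Map each rule ID to the list of text spans it flagged.

    Assumes every match has .ruleId, .offset and .errorLength
    (an assumption about the match objects, not confirmed here).
    """
    spans = defaultdict(list)
    for match in matches:
        start = match.offset
        end = start + match.errorLength
        spans[match.ruleId].append(text[start:end])
    return dict(spans)


# Hypothetical stand-ins for match objects; in real use, pass the
# list returned by tool.check(text) instead.
sample = "Me and friend like movie. Me watch movie many time."
fake_matches = [
    SimpleNamespace(ruleId="CONFUSION_OF_ME_I", offset=0, errorLength=2),
    SimpleNamespace(ruleId="MANY_TIME", offset=41, errorLength=9),
]

flagged = group_flagged_text(sample, fake_matches)
for rule_id, pieces in flagged.items():
    print(rule_id, len(pieces), pieces)
```

With real matches, len(pieces) per rule ID is the same count that the Counter above gives, but the printed spans show where each rule actually fired.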

Thank you!