I’m sketching out an idea for a readability assessment program. It will report the education level required to comfortably read a body of text, using formulas (Dale-Chall being the most significant) that count things like sentence length and the vocabulary level of each word. I was inspired by the word counter website I always paste my essays into. When it’s done, I would like to plug it into the Lemmy, Mastodon, and Discord APIs so it can be used there.
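For reference, the scoring step of such a program can be quite small. The sketch below assumes the commonly published coefficients of the "new" Dale-Chall formula (worth double-checking against whatever reference the project uses). It also assumes the word, sentence, and difficult-word counts have already been gathered elsewhere. The `text_counts` struct and `dale_chall_score` name are made up for illustration.

```c
#include <stddef.h>

/* Counts gathered from a text. Deciding which words are "difficult"
   (not on the easy-word list) is left to the caller. */
struct text_counts {
    size_t words;
    size_t sentences;
    size_t difficult_words;
};

/* New Dale-Chall raw score, using the commonly published coefficients:
     0.1579 * (percent difficult words) + 0.0496 * (words per sentence)
   plus 3.6365 when more than 5% of the words are difficult.
   Returns -1.0 when there is nothing to score. */
double dale_chall_score(const struct text_counts *c)
{
    if (c->words == 0 || c->sentences == 0)
        return -1.0;

    double pct_difficult = 100.0 * (double)c->difficult_words / (double)c->words;
    double avg_sentence_len = (double)c->words / (double)c->sentences;

    double score = 0.1579 * pct_difficult + 0.0496 * avg_sentence_len;
    if (pct_difficult > 5.0)
        score += 3.6365;
    return score;
}
```

A higher score means harder text; mapping the raw score to a grade band would be a separate lookup step.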
Bots are allowed but you should mark a bot account as such from https://lemmygrad.ml/settings so that people know that the activity is from a bot.
There is an HTTP API. The docs for it are supposed to be at https://join-lemmy.org, but the site seems to be down right now.
I made a bot library a while back. I need to update it for 0.18.3, but it should simplify things compared to managing the API requests directly.
I’ll probably use this if I get that far with this project, since I plan for the bots to be the last thing I add, after the CLI can do everything I want it to. Thank you!
BTW, do you guys think I should use a database for this? One of the formulas uses a list of 4,000 easy words, and storing lists of common proper nouns will help with flagging them. Also, I could probably get vocab-level data for tens of thousands of words… would that be better in a DB than in a ginormous hash table or trie?
With that small of a dataset imo either option is fine. If it were me I would use an ORM + sqlite just to start, in case I ever needed to migrate to a “real” database.
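As a rough illustration of the in-memory option, a sorted array searched with the standard library's `bsearch` is probably the least code for a list of a few thousand words. This is only a sketch: the eight-word list is a stand-in for the real one, and `is_easy_word` is a made-up name.

```c
#include <stdlib.h>
#include <string.h>

/* Tiny stand-in for the real easy-word list. It must stay sorted in
   strcmp order, otherwise bsearch gives wrong answers. */
static const char *const easy_words[] = {
    "about", "after", "book", "cat", "dog", "house", "read", "water"
};
static const size_t easy_word_count = sizeof easy_words / sizeof easy_words[0];

/* bsearch passes the key first and a pointer to an array element second. */
static int cmp_word(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns 1 if `word` (assumed already lowercased) is on the list, else 0. */
int is_easy_word(const char *word)
{
    return bsearch(word, easy_words, easy_word_count,
                   sizeof easy_words[0], cmp_word) != NULL;
}
```

If the lookup is kept behind a function like this, the array can be swapped for a database query later without touching the callers.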
Thank you!
ORM + sqlite
I am writing in C (the CLI, which I’ll just have the bots call) and have never used a database. Do you think using the sqlite C interface directly, with some cursory reading of the docs, would be too much? Of course, I could switch it all to C++, where there appears to be at least one nice ORM.
I think if you’re storing vocabulary etc, using the C interface for sqlite wouldn’t be too unwieldy and would be a good learning experience if you haven’t done much raw SQL query writing of your own. Even when you use an ORM there are often times you need to write your own queries for more complicated situations.
One other suggestion: once you have the CLI and bots working, you could abstract this even more. Have a service process that does the actual text analysis and communicates over some channel (IPC, a network port, etc.). Your CLI and bots can then just interface over that channel. This gives you separation of concerns, so you can add or rework clients and servers much more easily.
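A very rough sketch of that split, assuming a POSIX system: the "service" here is a forked child reached over a `socketpair`, and a word count stands in for the real analysis. A real service would be long-lived and would probably listen on a named Unix socket or a port instead. All function names, the 4096-byte request limit, and the one-request protocol are made up for illustration.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the real analysis: counts whitespace-separated words. */
static uint32_t count_words(const char *s, size_t n)
{
    uint32_t words = 0;
    int in_word = 0;
    for (size_t i = 0; i < n; i++) {
        if (isspace((unsigned char)s[i])) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++;
        }
    }
    return words;
}

/* Service side: read one request (at most 4096 bytes) until the client
   shuts down its write end, then reply with the result. */
static void serve(int fd)
{
    char buf[4096];
    size_t len = 0;
    ssize_t n;
    while (len < sizeof buf &&
           (n = read(fd, buf + len, sizeof buf - len)) > 0)
        len += (size_t)n;

    uint32_t result = count_words(buf, len);
    if (write(fd, &result, sizeof result) != (ssize_t)sizeof result)
        _exit(1);
}

/* Client side: sends `text` to a freshly forked service and returns its
   answer, or -1 on failure. */
long analyze_via_service(const char *text)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: the service */
        close(sv[0]);
        serve(sv[1]);
        _exit(0);
    }

    close(sv[1]);                   /* parent: the client */
    size_t len = strlen(text);
    uint32_t result = 0;
    long answer = -1;
    if (write(sv[0], text, len) == (ssize_t)len) {
        shutdown(sv[0], SHUT_WR);   /* marks the end of the request */
        if (read(sv[0], &result, sizeof result) == (ssize_t)sizeof result)
            answer = (long)result;
    }
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return answer;
}
```

The CLI and each bot would only need the client half; the analysis code lives behind the socket and can change without the clients being rebuilt.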