Update to Language Detection Notebook
A couple of quick updates, both about the Jupyter notebook on language detection that I first created back in May. The first is that I had the pleasure of presenting my project to the good people at NWA TechFest at their monthly meeting at the end of September. I got tons of insightful questions from the group who showed up, and I got to catch up with friends old and new. If you’re in the Northwest Arkansas area, I highly recommend showing up to their meetings. They’re very friendly and open to ideas for presentations as well.
The second update is that I’ve created a second iteration of the notebook, one that was very much informed by the questions I received at that meeting. By focusing the algorithm on consonant clusters instead of just average letter distribution, I was able to improve its accuracy by about 50% without sacrificing much in the way of performance. The next step will be to try to make it Unicode compliant, but I’ll let you look at the notebook itself for the details.
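For anyone who wants a feel for the idea before opening the notebook, here is a rough sketch of what a consonant-cluster approach could look like. This is not the code from the notebook: the function names (`cluster_profile`, `cosine`, `detect`), the cosine-similarity comparison, and the tiny training sentences are all assumptions made for illustration. It is also ASCII-only, which is exactly the limitation mentioned above.

```python
import re
from collections import Counter
from math import sqrt

# Runs of two or more ASCII consonants (everything except a, e, i, o, u).
# Assumption: "y" is treated as a consonant, and non-ASCII letters are ignored.
CLUSTER_RE = re.compile(r"[b-df-hj-np-tv-z]{2,}")


def cluster_profile(text):
    """Return the relative frequency of each consonant cluster in text."""
    counts = Counter(CLUSTER_RE.findall(text.lower()))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cluster: n / total for cluster, n in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(value * q.get(key, 0.0) for key, value in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def detect(text, profiles):
    """Pick the language whose cluster profile is closest to the text's."""
    sample = cluster_profile(text)
    return max(profiles, key=lambda lang: cosine(sample, profiles[lang]))


# Hypothetical, deliberately tiny training samples; a real model would
# build its profiles from much larger corpora.
profiles = {
    "english": cluster_profile("the strength of the church and the knights"),
    "german": cluster_profile("ich spreche deutsch und schreibe schnell"),
}

print(detect("the thing", profiles))
print(detect("schnell sprechen", profiles))
```

The intuition is that clusters like "th" or "schn" are far more language-specific than single-letter frequencies, so even a small profile can separate languages that look similar letter by letter.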