title | layout | description | permalink |
---|---|---|---|
Highlights | page | Start here | /start-here/ |
Here is a curated list of content, organized by topic.
Natural language processing (NLP) and computational linguistics are my primary research interests. I'm particularly into low-resource and multilingual NLP, efficient NLP, and corpus annotation.
- Towards a Tagalog NLP pipeline
- Dependency parsing for a low-resource language (Tagalog)
- Your train-test split may be doing you a disservice
- A framework for designing document processing solutions
As a spaCy developer, I find some parts of the spaCy codebase particularly interesting. Here are some of my independent explorations of them.
- spaCy Internals: Rules-based rules!
- spaCy Internals: Spancat architecture walkthrough
- spaCy Internals: configuration and project system
These are select topics from my notes as I go through NLP and linguistics courses. I try to make them beginner-friendly, so just read on!
- Study notes on regular expressions and finite state automata
- Study notes on making word vectors from scratch
I'm also interested in the practical application of machine learning to non-trivial problems. I find the intersection of software engineering and ML to be an exciting space, so I made a lot of notes about it!
- Navigating the MLOps landscape (three-part series): #2 #3
- How to improve software engineering skills as a researcher
- Towards data-centric machine learning: a short review
- How to use Jupyter Notebooks in 2020 (a bit outdated)
- Why do we need Flask, Celery, and Redis? (Hacker News)
I'm a hobbyist game dev and I love creating games. I am also a fan of the Godot Engine and Pico-8. Play my games by visiting my itch.io page.