Java

Bioinformatics data

By tom@khubla.com | April 28, 2014 | May 2, 2014

I recently had a chance to learn a little about bioinformatics, and ended up browsing the NIH’s database of genomes here. Inside the genome data for any particular strain of a species, you’ll find files with extensions like “faa”, “fna”, “ffn” and “frn”. These are FASTA files.

If you’d like an example, here’s the genomic data for a certain strain of E. coli.

The FASTA file format is described pretty well on Wikipedia. I immediately wondered how difficult it would be to read entire files and import them into a relational database. The difficult part of this work is, of course, parsing the FASTA files. To support that, I wrote an ANTLR4 grammar for FASTA files. The result is here. Once the parser is built, it’s trivial to walk the parse tree and insert the appropriate rows.
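To give a feel for what the parsing step involves, here is a minimal sketch in plain Java. It is a hand-rolled, line-based reader, not the ANTLR4 grammar mentioned above, and it assumes the simplest FASTA layout: a description line starting with “>” followed by one or more sequence lines. The names `FastaSketch`, `FastaEntry` and `parse` are made up for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hand-rolled FASTA reader; an illustrative sketch, not the ANTLR4 grammar. */
final class FastaSketch {

    /** One FASTA entry: the description line (without '>') and the joined sequence. */
    static final class FastaEntry {
        final String description;
        final String sequence;

        FastaEntry(String description, String sequence) {
            this.description = description;
            this.sequence = sequence;
        }
    }

    /** Splits FASTA text into entries; each entry could become one database row. */
    static List<FastaEntry> parse(String fasta) {
        List<FastaEntry> entries = new ArrayList<>();
        String description = null;
        StringBuilder sequence = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(fasta))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // tolerate blank lines between entries
                }
                if (line.startsWith(">")) {
                    // A new description line closes the previous entry, if any.
                    if (description != null) {
                        entries.add(new FastaEntry(description, sequence.toString()));
                    }
                    description = line.substring(1).trim();
                    sequence.setLength(0);
                } else if (description != null) {
                    // Sequence data may span many lines; join them.
                    // Lines appearing before the first '>' are ignored.
                    sequence.append(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (description != null) {
            entries.add(new FastaEntry(description, sequence.toString()));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from a real genome file.
        String sample = ">seq1 hypothetical example record\nACGTACGT\nTTGA\n>seq2\nMKV\n";
        for (FastaEntry entry : parse(sample)) {
            System.out.println(entry.description + " -> " + entry.sequence.length() + " residues");
        }
    }
}
```

From there, each `FastaEntry` could be written to a table with a JDBC `PreparedStatement`, assuming a table with a description column and a sequence column. A grammar-based parser is the better choice once you want proper error reporting or need to handle the format’s variations.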

If you’re interested, the human genome is here, listed by chromosome. However, those files are in GenBank format, which is a grammar for another day.

Update: the source is available in the ANTLR4 grammars repository: antlr/grammars-v4
