Lessons learned from a Node project
During the 2020 Covid-19 pandemic, I found myself unemployed during a lull in hiring. As I had spent the previous nine years working on a C# desktop product, I decided it was time to refresh my skills as a web developer. Thus, I created a personal blog engine that targeted a modern NodeJS stack, Postgres database, and the Heroku hosting engine: https://github.com/GWBasic/z3.
Here are some lessons learned, and some troubles that I encountered.
The first time I used NodeJS was in 2010, when it was all callbacks. At the time, nesting callbacks within callbacks led to very convoluted and difficult-to-debug code. As much as I liked NodeJS, at the time I viewed it as a toy for small utilities.
Now NodeJS has async / await and Visual Studio Code supports full debugging of code running in NodeJS. It's a mature environment!
Async Code and Promises aren't As Popular As They Should Be
Back in 2010 when I was learning about NodeJS, I went to a meetup at Yahoo's headquarters in Santa Clara, CA. Douglas Crockford presented, with rather religious zeal, about how asynchronous code was significantly superior to threaded code due to the lack of performance overhead for context switching.
During a Q&A, I pointed out that async code is much harder to read and develop in, and I asked what would happen to fix that. His answer was quite rude and flippant.
The thing that surprised me is how little NodeJS code supports promises or is written with async / await. I did end up using some npm libraries that didn't support promises, and most examples that I came across were written with callbacks.
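One way to use a callback-only library from async / await code is Node's built-in util.promisify. The sketch below is only an illustration: legacyAdd is a hypothetical stand-in for an older npm library function, not something from Z3.

```javascript
const util = require('util');

// Hypothetical callback-style function, standing in for an older npm library.
// It follows the usual Node convention: the last argument is an (err, value) callback.
function legacyAdd(a, b, callback) {
  setImmediate(() => {
    if (typeof a !== 'number' || typeof b !== 'number') {
      callback(new TypeError('both arguments must be numbers'));
      return;
    }
    callback(null, a + b);
  });
}

// util.promisify returns a version of the function that returns a promise instead.
const add = util.promisify(legacyAdd);

async function main() {
  // The wrapped function can now be awaited; errors surface as rejections.
  return await add(2, 3);
}
```

Calling main() resolves with the sum, and a bad argument rejects the promise, so an ordinary try / catch works around the await.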
Express is the de facto web hosting framework for NodeJS. There are many libraries and extensions that plug into Express, greatly simplifying things. (In contrast, when I tried NodeJS in 2010, none of these libraries existed!)
Shortly before I decided to write Z3, I read through some tutorials for EmberJS because I was talking with a company that uses it. I was very impressed with EmberJS, so I initially investigated it. One thing that wasn't clear, when working through the tutorials, is that EmberJS is a complete client-side framework where all HTML is generated within the browser. I don't think this is a deliberate deception on the part of whoever wrote EmberJS's tutorials; but rendering all of your HTML in the browser has significant enough problems that it's really only appropriate for complicated single-page browser applications.
What I was looking for was a templating engine that gave all pages a common look and feel. Ideally, I wanted a templating engine where there was a single page that declared links to the css files, and generated the common headers and footers. I just wanted to generate specific content for each page and have the templating engine fill in the rest.
I found that Mustache templating and Handlebars templating are very popular with NodeJS for final rendering of HTML before sending it to the browser, but those templating engines don't do anything for a common look and feel. As a result, I created the pogon.html templating engine. It merges HTML on the server before returning it to the browser, providing a common look and feel to all pages in the z3 blog engine.
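To make the idea of a shared layout concrete, here is a minimal sketch of merging page-specific content into one common page on the server. It is not pogon.html's actual API: the renderWithLayout function, the slot comment syntax, and the file names are all assumptions made up for illustration.

```javascript
// Hypothetical shared layout: the one place that links the CSS and holds
// the common header and footer. The <!--slot:name--> markers are invented
// for this sketch.
const layout = [
  '<html>',
  '<head><title><!--slot:title--></title><link rel="stylesheet" href="/site.css"></head>',
  '<body><header>My Blog</header>',
  '<main><!--slot:content--></main>',
  '<footer>Common footer</footer></body>',
  '</html>',
].join('\n');

// Replace each slot marker with the matching value; unknown slots become ''.
// Note: values are inserted as-is, so they must already be safe HTML.
function renderWithLayout(layoutHtml, values) {
  return layoutHtml.replace(/<!--slot:(\w+)-->/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? values[name] : '');
}

const page = renderWithLayout(layout, {
  title: 'First post',
  content: '<p>Hello, world.</p>',
});
```

Each page then only supplies its own title and content, and the header, footer, and stylesheet links come from the single layout.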
I ended up using an npm module, use-strict, to enable strict mode in my code globally. Fortunately, every library that I ended up using works fine with strict mode. I did come across one library that didn't work under strict mode, but that was just a sign (code smell) that the library had other bugs, so I found a different library that was easier to work with.
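As a small illustration of the kind of mistake strict mode catches, the sketch below uses the standard 'use strict' directive inside a single function (rather than the use-strict module) so that it is self-contained. The function name and the typo are made up for the example.

```javascript
function countPostsStrict(posts) {
  'use strict';
  // Typo: "totl" was never declared. Without strict mode this assignment
  // would silently create a global variable; in strict mode it throws.
  totl = posts.length;
  return totl;
}

let caught = null;
try {
  countPostsStrict(['a', 'b']);
} catch (err) {
  caught = err;
}

console.log(caught instanceof ReferenceError); // prints true
```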
Forgetting to call await
A common cause of bugs in my code was forgetting to use the await keyword. I've written a lot of C#, and the C# compiler is smart enough to warn you when you forget to use the await keyword. NodeJS gives no such warning, which led to some hard-to-track-down bugs and unit tests that passed when they should have failed. More on this later.
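Here is a minimal sketch of how a missing await can hide a bug. The findPost function is a hypothetical stand-in for a database lookup; nothing here is taken from Z3's code.

```javascript
// Hypothetical async lookup, standing in for a database call.
async function findPost(id) {
  return id === 1 ? { id: 1, title: 'Hello' } : null;
}

async function buggyExists(id) {
  const post = findPost(id); // BUG: missing await, so post is a Promise
  return post !== null;      // always true, because a Promise is never null
}

async function fixedExists(id) {
  const post = await findPost(id);
  return post !== null;
}

// Post 999 does not exist, yet the buggy version reports that it does.
async function demo() {
  return {
    buggy: await buggyExists(999), // true (wrong)
    fixed: await fixedExists(999), // false (correct)
  };
}
```

A linter can catch some of these cases. For example, ESLint's require-await rule flags async functions that contain no await, which would point at buggyExists here.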
I generally try to avoid large frameworks unless there is a “strong need.” In my case, where I was mostly doing server-side rendering, there was little need for a large framework like Angular or React. On most pages, I use a small script to format dates.
There are some custom scripts for the configuration page, publishing page, and password changing page that mostly deal with input validation. If these pages got significantly more complicated I'd consider trying out a framework with smarter input validation.
JQuery is more popular than I anticipated
Rich text editing
I had a lot of trouble with the rich text editor. When I developed ObjectCloud I found a rich text editor, Nicedit, that was very easy to work with. Unfortunately, Nicedit is now end-of-life. Instead, I used CKEditor. The editor works great, but the documentation is extremely confusing. There are many features that I personally can't figure out how to enable because, even though the documentation discusses the feature, the documentation leaves out critical details or provides examples without enough context. For example, I tried to add support for the <pre> tag to the rich text editor, but gave up because none of the examples for custom styles worked.
If I were to continue maintaining Z3, I'd consider switching to a different rich text editor.
CDNs are cool
This does mean that Z3 may stop working correctly when my dependencies are end-of-life. I should probably capture a snapshot of these libraries in source code control as a backup.
From what I could determine, Mocha is the default unit test framework for NodeJS, thus I wrote my tests for Mocha. Overall, I'm very happy with Mocha.
However, as I used Visual Studio Code, I wanted a test runner that showed tests in the IDE. To do this, I used the “Test Explorer UI” and the “Mocha Test Explorer” pair of plugins. These had a lot of bugs, which ended up biting me later:
- Edited files are not saved before running tests
- If a test has unresolved promises the test passes. If these promises fail, the test appears to pass
- In a group of tests, if the first test's cleanup fails, the test passes. If the group is collapsed, the entire group appears to pass
The above problems (unresolved promises and failing cleanup) caused problems when I was refactoring. I thought most of my tests passed, but I didn't realize that many were failing. I haven't submitted these bugs yet. Thus, if you're running tests in your IDE, it's good practice to periodically run npm test in the console to make sure that the IDE isn't hiding critical failures.
Normally, when I start a hobby / learning project, I like to find an unusual or unique database to work with. I'm a big believer in two things about databases:
- Most products will do fine with a traditional SQL database
- SQL is a pain in the rear end to work with
When I worked on ObjectCloud, I had a proprietary branch that used MongoDB. I really found its syntax a joy to work with compared to SQL, but its lack of transactions and poor data integrity left a sour taste in my mouth. I personally would prefer to work with a database that has MongoDB syntax, but traditional transactional integrity.
I initially found NeDB. It's an embedded database that's almost a drop-in replacement for MongoDB. Although I found it a joy to work with, it did have some problems:
- No support for promises. (But the nedb-promises library adds them.)
- Cannot close a database once it's opened
The reason why I initially chose an embedded database is that traditional web hosts typically allow filesystem access, and then periodically back up the files. For example, for a while I used PHP blogging software on andrewrondeau.com that didn't need a database. Instead, it just wrote to the filesystem and relied on my web host to back up periodically.
Considering that a personal blog won't get high usage, an embedded database on a traditional web host is a perfect way to keep development and hosting simple…
…But, when I looked around for NodeJS hosting, I found that most traditional web hosts don't offer NodeJS. Perhaps it's because they use Apache, and NodeJS is fundamentally incompatible with Apache? Either way, Heroku offered free hosting, but the catch was that any writes to the file system were lost due to how it scales.
I had no argument with that: Z3 is a learning project and Postgres is popular now, so I decided that I should switch to Postgres.
Initially, rewriting the layer that used NeDB was easy, because I had a great set of unit tests. The challenge came from the various configuration files that I wrote to the file system, and read at startup:
- User-specified config, like blog name
- Session management information
- The password
- Profile pictures
At startup, in NodeJS it's common to make blocking calls to read from the filesystem, and that's how I read things like the user-specified configuration and session management information. These were JSON files so they would be easy to adjust. But… That's not how you do things in modern Heroku! All of this configuration needed to move to the database. This required me to refactor the startup sequence. More on this later.
One challenge that I had was that I prefer using UUIDs for primary keys if they will be exposed to the user. I tried switching a table to UUIDs, but it just didn't work. UUID support isn't enabled by default in Postgres, and I had a problem where sometimes when I enabled UUID support, it still didn't work. I suspect it's a bug in the Postgres application for macOS. In the interest of time, I decided to stick with numerical IDs. (IDs are only exposed to the owner of the application, limiting the attack surface.)
Another thing that I took the time to learn was events in Postgres. Through its LISTEN and NOTIFY commands, Postgres also works as an event bus. I used this because I believed it was important to cache configuration information in RAM instead of constantly loading it from the database. In the case of running multiple web servers, the event bus in Postgres tells each server to refresh the configuration information when it changes.
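The caching pattern can be sketched roughly as follows. This is an illustration under assumptions, not Z3's implementation: the config table, its name and value columns, and the config_changed channel are hypothetical, and client is assumed to behave like a connected node-postgres Client (a query method that returns a promise, plus 'notification' events).

```javascript
// Load every row of a hypothetical config(name, value) table into a plain object.
async function loadConfig(client) {
  const result = await client.query('SELECT name, value FROM config');
  return Object.fromEntries(result.rows.map((row) => [row.name, row.value]));
}

// Keep the configuration cached in RAM and reload it whenever Postgres
// delivers a notification on the given channel.
// The channel name is interpolated into SQL, so it must be a trusted constant.
async function createConfigCache(client, channel = 'config_changed') {
  let config = {};

  client.on('notification', (msg) => {
    if (msg.channel !== channel) return;
    loadConfig(client)
      .then((fresh) => { config = fresh; })
      .catch((err) => console.error('config reload failed', err));
  });

  // Subscribe before the first load so a change made in between is not missed.
  await client.query(`LISTEN ${channel}`);
  config = await loadConfig(client);

  return { get: () => config };
}
```

Whichever server saves a configuration change would then run NOTIFY config_changed after its update, so every listening server reloads its cached copy.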
One major career lesson that I've learned is to make sure that your startup order is well-defined.
A big problem that I had was that the default template for creating an ExpressJS web app constructs your application object synchronously. This is fine if you're only reading from disk, but if you need to do anything with promises, you're in trouble.
I refactored the default www file to handle a promise when constructing my application object. This allowed me to use async / await to set up the default schema, and then load configurations from the schema.
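The shape of that refactoring can be sketched as below. To stay self-contained, this uses Node's built-in http module instead of Express, and ensureSchema and loadSettings are hypothetical placeholders for the real schema and configuration steps.

```javascript
const http = require('http');

// Hypothetical async setup steps, standing in for creating the default
// schema and loading configuration from the database.
async function ensureSchema() { /* e.g. create tables if they are missing */ }
async function loadSettings() { return { title: 'My Blog' }; }

// Build the request handler only after the async setup has finished.
async function createApp() {
  await ensureSchema();
  const settings = await loadSettings();
  return (req, res) => {
    res.setHeader('Content-Type', 'text/plain');
    res.end(settings.title);
  };
}

// Equivalent of the www script: wait for the app, then start listening.
async function start(port) {
  const app = await createApp();
  const server = http.createServer(app);
  await new Promise((resolve, reject) => {
    server.once('error', reject);
    server.listen(port, resolve);
  });
  return server;
}

// Typical entry point (commented out so the sketch has no side effects):
// start(process.env.PORT || 3000).catch((err) => {
//   console.error(err);
//   process.exit(1);
// });
```

Because start returns a promise, a failure during schema setup or configuration loading surfaces in one place instead of leaving a half-initialized server running.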
But, in general, refactoring startup to query the database was very painful. NodeJS allows making blocking calls to the filesystem, which I had used to read various configuration files at startup. Refactoring all of this to use the database required paying very careful attention to when initial database connections and schema management were made. It took me longer to refactor startup code away from configuration files than it took me to refactor data access code away from NeDB.
Hosting In Heroku
So far, I haven't had many problems with Heroku. Most of my problems were bugs in my code or my libraries:
- Postgres wasn't enabled by default. This was easy to fix.
- I had to configure Express to trust Heroku's proxy…
- … But client-sessions has a strange bug where it also needs to be told to trust the proxy
- Heroku doesn't construct empty folders. This caused a problem because I scan a folder looking for user-defined themes
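For the last item, one defensive approach is to create the folder before scanning it. Git does not track empty directories, which is a likely reason an empty folder goes missing after a deploy. The sketch below assumes each theme is a sub-folder; listThemes and that layout are hypothetical, not Z3's actual code.

```javascript
const fs = require('fs');

// Return the names of the sub-folders in themesDir,
// creating themesDir first if it does not exist.
function listThemes(themesDir) {
  // With recursive: true, mkdirSync creates any missing parent folders
  // and does nothing if the folder is already there.
  fs.mkdirSync(themesDir, { recursive: true });
  return fs.readdirSync(themesDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => entry.name);
}
```

An alternative is to commit a placeholder file into the otherwise empty folder so that it is included in the deploy.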
So, in conclusion, NodeJS is an industrial-strength modern development stack. If I were developing a professional web application, what would I do differently?
- I'd start with Postgres
- I'd take the time to learn something like WebPack