Z3 Test Blog
Andrew Rondeau

Lessons learned from a Node project

Published: Wed Jun 03 2020 21:01:14 GMT+0000 (Coordinated Universal Time) (Updated: Wed Jun 10 2020 18:21:20 GMT+0000 (Coordinated Universal Time))

During the 2020 Covid-19 pandemic, I found myself unemployed during a lull in hiring. As I spent the last nine years working on a C# desktop product, I decided it was time to refresh my skills as a web developer. Thus, I created a personal blog engine that targeted a modern NodeJS stack, Postgres database, and the Heroku hosting engine: https://github.com/GWBasic/z3.

Here are some lessons learned, and some troubles that I encountered.

Goals

I've mostly determined that going heavy on client-side Javascript only works for client-side applications that happen to run in the browser: Gmail, Google Drive, etc. Otherwise, the web is designed around servers serving HTML. Thus, I decided that I would focus on server-side Javascript and server-side rendering.

For client-side Javascript, I decided to avoid frameworks and stick with pure Javascript. Mostly this was to avoid an additional learning curve. More on this later.

NodeJS and server-side Javascript

The first time I used NodeJS was in 2010, when it was all callbacks. At the time, nesting callbacks within callbacks led to very convoluted and difficult-to-debug code. As much as I liked NodeJS, at the time I viewed it as a toy for small utilities.

Now NodeJS has async / await and Visual Studio Code supports full debugging of code running in NodeJS. It's a mature environment!

Async Code and Promises aren't As Popular As They Should Be

Back in 2010 when I was learning about NodeJS, I went to a meetup at Yahoo's headquarters in Santa Clara, CA. Douglas Crockford presented, with rather religious zeal, about how asynchronous code was significantly superior to threaded code due to the lack of performance overhead for context switching.

During a Q&A, I pointed out that async code is much harder to read and develop in, and I asked what would happen to fix that. His answer was quite rude and flippant.

But shortly afterward, Javascript got promises, which made async code a lot easier to follow. Next, it got async / await, which makes async code look like traditional threaded code.

The thing that surprised me is how little NodeJS code supports promises or is written with async / await. I did end up using some npm libraries that didn't support promises, and most examples that I came across were written with callbacks.

Considering how much simpler async / await is, avoiding it baffles me. I wonder if there's a major part of the Javascript community that sees lots of callbacks as some kind of badge of honor.
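To make the difference concrete, here is a minimal sketch of the same lookup written three ways. It is not code from Z3: findPost and findAuthor are hypothetical callback-style functions standing in for an older library, and util.promisify (part of Node's standard library) adapts them to promises.

```javascript
const { promisify } = require('util');

// Hypothetical callback-style API, in the shape common in older Node code.
function findPost(id, callback) {
  setImmediate(() => {
    if (id === 1) callback(null, { id: 1, authorId: 7 });
    else callback(new Error('no such post'));
  });
}

function findAuthor(id, callback) {
  setImmediate(() => callback(null, { id, name: 'Ada' }));
}

// 1. Callbacks: every step nests, and every level repeats the error check.
function bylineWithCallbacks(postId, callback) {
  findPost(postId, (err, post) => {
    if (err) return callback(err);
    findAuthor(post.authorId, (err, author) => {
      if (err) return callback(err);
      callback(null, `Post ${post.id} by ${author.name}`);
    });
  });
}

// 2. Promises: promisify wraps any function whose last argument is an
//    (err, value) callback. Errors propagate down the chain on their own.
const findPostAsync = promisify(findPost);
const findAuthorAsync = promisify(findAuthor);

function bylineWithPromises(postId) {
  return findPostAsync(postId).then((post) =>
    findAuthorAsync(post.authorId).then(
      (author) => `Post ${post.id} by ${author.name}`
    )
  );
}

// 3. async / await: reads top to bottom like blocking code.
async function byline(postId) {
  const post = await findPostAsync(postId);
  const author = await findAuthorAsync(post.authorId);
  return `Post ${post.id} by ${author.name}`;
}
```

All three produce the same result; only the third keeps the happy path and the error path on a single level of indentation.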

Express

Express is the de facto web hosting framework for NodeJS. There are many libraries and extensions that plug into Express, greatly simplifying things. (In contrast, when I tried NodeJS in 2010, none of these libraries existed!)

One thing that did surprise me is that Express, by default, doesn't resolve promises. Considering how core they are to modern Javascript development, I was surprised that I needed to use an external library. Without one, Express is inconsistent in how it handles exceptions: an exception thrown before a promise is created is handled by Express, but an exception thrown inside a promise must be handled in the route handler itself. This makes sense when writing code with callbacks or promises, but it's very counterintuitive when writing async code.
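As a minimal sketch of what such a helper does (this is a hand-rolled stand-in, not the library Z3 uses; asyncRoute and loadPost are hypothetical names), a wrapper can forward a rejected promise to Express's next callback so that the normal error-handling middleware sees it:

```javascript
// Wraps an async route handler. If the handler's promise rejects, the error
// is passed to next(), which is how Express expects to be told about errors.
function asyncRoute(handler) {
  return (req, res, next) => {
    Promise.resolve(handler(req, res, next)).catch(next);
  };
}

// Usage with a hypothetical Express app and data-access function:
//
//   app.get('/blog/:url', asyncRoute(async (req, res) => {
//     const post = await loadPost(req.params.url); // a rejection reaches next()
//     res.render('post', post);
//   }));
```

With the wrapper in place, an exception thrown before the first await and one thrown after it both end up in the same error handler.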

Template Engines

Shortly before I decided to write Z3, I read through some tutorials for EmberJS because I was talking with a company that uses it. I was very impressed with EmberJS, so I initially investigated it. One thing that wasn't clear, when working through the tutorials, is that EmberJS is a complete client-side framework where all HTML is generated within the browser. I don't think this is a deliberate deception on the part of whoever wrote EmberJS's tutorials; but rendering all of your HTML in the browser has enough significant problems that it's really only appropriate for complicated single-page browser applications.

What I was looking for was a templating engine that gave all pages a common look and feel. Ideally, I wanted a templating engine where there was a single page that declared links to the css files, and generated the common headers and footers. I just wanted to generate specific content for each page and have the templating engine fill in the rest.

I found that Mustache and Handlebars templating are very popular with NodeJS for final rendering of HTML before sending it to the browser, but those templating engines don't do anything for a common look and feel. As a result, I created the pogon.html templating engine. It merges HTML on the server before returning it to the browser, providing a common look and feel to all pages in the Z3 blog engine.
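The general idea of merging page content into one shared layout can be sketched in a few lines. This is only an illustration of the concept, not pogon.html's API or implementation; applyLayout, escapeHtml, and the comment-style placeholders are all hypothetical.

```javascript
// Escapes text so that it is safe to place inside HTML.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Fills a shared layout with one page's title and already-rendered content.
// Replacement functions are used so that "$" in the content is not treated
// as a special replacement pattern by String.prototype.replace.
function applyLayout(layoutHtml, page) {
  return layoutHtml
    .replace('<!-- title -->', () => escapeHtml(page.title))
    .replace('<!-- content -->', () => page.content);
}
```

The layout file declares the stylesheet links, header, and footer once; each route only has to produce its own title and content.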

“use strict”

One thing that caught me off guard was that I initially didn't know about “use strict.” I had a few bugs where I accidentally used an undeclared variable. In such a case, Javascript implicitly creates the variable in the global scope, causing problems if you have multiple promises running that hit the same code with the undeclared variable.

The normal way to fix this is by putting “use strict” at the beginning of a Javascript file, but such a requirement concerns me. It's very easy to accidentally use an undeclared variable and/or forget to put “use strict” at the beginning of a file. I'd much rather enable this by default in NodeJS, and require some kind of advanced setting to disable this.

I ended up using an npm module, use-strict, to enable this in my code globally. Fortunately, every library that I use works fine with strict mode. I did come across one library that didn't work under strict mode, but that was just a sign (code smell) that the library had other bugs. I found a different library that was easier to work with.
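A minimal sketch of the difference, assuming the file itself is an ordinary (non-strict) CommonJS script. The function and variable names are made up for illustration; the directive is shown per function here only so both behaviors fit in one file.

```javascript
function sumSloppy(values) {
  // No let/const/var: outside strict mode this silently creates a global
  // variable named sloppyTotal, shared by every caller of this function.
  sloppyTotal = 0;
  for (const value of values) sloppyTotal += value;
  return sloppyTotal;
}

function sumStrict(values) {
  'use strict'; // the directive can also be the first statement of a file
  // In strict mode the same mistake throws a ReferenceError immediately.
  strictTotal = 0;
  for (const value of values) strictTotal += value;
  return strictTotal;
}
```

In the sloppy version, two overlapping async operations that both pass through the function would overwrite each other's total, which is exactly the kind of bug that is hard to reproduce.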

Forgetting to call await

A common cause of bugs in my code was forgetting to use the await keyword. I've written a lot of C#, and the C# compiler is smart enough to warn you when you forget to use the await keyword. NodeJS gives no such warning, which led to some hard-to-track-down bugs and unit tests that passed when they should have failed. More on this later.
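Here is a minimal sketch of how the mistake hides (the function names are hypothetical, not taken from Z3). An async function always returns a promise, and a promise is always truthy, so a missing await turns a real check into one that always succeeds:

```javascript
// Hypothetical async check, e.g. something that would query the database.
async function isPublished(post) {
  return post.published === true;
}

function canShowBuggy(post) {
  // Bug: no await. isPublished() returns a Promise object, which is truthy
  // no matter what value it eventually resolves to.
  if (isPublished(post)) {
    return true;
  }
  return false;
}

async function canShow(post) {
  // Fixed: await unwraps the promise to the actual boolean.
  if (await isPublished(post)) {
    return true;
  }
  return false;
}
```

Nothing fails at runtime in the buggy version; it simply answers true for every post, and a test that only checks the published case keeps passing.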

Client-Side Javascript

I generally try to avoid large frameworks unless there is a “strong need.” In my case, where I was mostly doing server-side rendering, there was little need for a large framework like Angular or React. On most pages, I use a small script to format dates.

There are some custom scripts for the configuration page, publishing page, and password changing page that mostly deal with input validation. If these pages got significantly more complicated I'd consider trying out a framework with smarter input validation.

JQuery is more popular than I anticipated

One thing that surprised me was that I kept coming across guides that encouraged JQuery. I briefly used JQuery when I developed ObjectCloud in 2009-2010. Given that client-side Javascript frameworks come and go, I thought that frameworks like React or Angular would supplant JQuery. Furthermore, now that Internet Explorer is ancient history and browser compatibility issues aren't as prevalent with basic HTML + Javascript, there's little need for a tool like JQuery.

Instead, what I observed is that the JQuery syntax is much cleaner than things like document.getElementById. I came across so many Javascript examples that used JQuery that I suspect it, ironically, appeals to developers who prefer “plain Javascript.”

Rich text editing

I had a lot of trouble with the rich text editor. When I developed ObjectCloud I found a rich text editor, Nicedit, that was very easy to work with. Unfortunately, Nicedit is now end-of-life. Instead, I used CKEditor. The editor works great, but the documentation is extremely confusing. There are many features that I personally can't figure out how to enable because, even though the documentation discusses the feature, the documentation leaves out critical details or provides examples without enough context. For example, I tried to add support for the <pre> tag to the rich text editor, but gave up because none of the examples for custom styles worked.

If I were to continue maintaining Z3, I'd consider switching to a different rich text editor.

CDNs are cool

Z3 loads all client-side Javascript through CDNs. It's also common to distribute client-side Javascript through npm, node's package manager, but that requires figuring out how to assemble client-side Javascript during the build using something like WebPack. I wanted to keep my learning curve small, so instead of checking client-side Javascript into my source tree, I decided that using CDNs for open-source dependencies was the best way to go.

The advantage of a CDN, especially when a Javascript library is widely used, is that it instructs the browser to cache the library for a very long time. This leads to fast page loads.

This does mean that Z3 may stop working correctly when my dependencies are end-of-life. I should probably capture a snapshot of these libraries in source code control as a backup.

If I were building an application that used significant client-side Javascript, I'd look more closely at tools like WebPack.

Testing

From what I could determine, Mocha is the default unit test framework for NodeJS, so I wrote my tests for Mocha. Overall, I'm very happy with Mocha.

However, as I used Visual Studio Code, I wanted a test runner that showed tests in the IDE. To do this, I used the “Test Explorer UI” and the “Mocha Test Explorer” pair of plugins. These had a lot of bugs, which ended up biting me later:

  • Edited files are not saved before running tests
  • If a test has unresolved promises, the test passes. If these promises fail, the test still appears to pass
  • In a group of tests, if the first test's cleanup fails, the test passes. If the group is collapsed, the entire group appears to pass

These problems (unresolved promises and failing cleanup) bit me when I was refactoring. I thought most of my tests passed, when in fact many were failing. I haven't submitted these bugs yet. Thus, if you're running tests in your IDE, it's good practice to periodically run npm test in the console to make sure that the IDE isn't hiding critical failures.

Database

Normally, when I start a hobby / learning project, I like to find an unusual or unique database to work with. I'm a big believer in two things about databases:

  • Most products will do fine with a traditional SQL database
  • SQL is a pain in the rear end to work with

When I worked on ObjectCloud, I had a proprietary branch that used MongoDB. I really found its syntax a joy to work with compared to SQL, but its lack of transactions and poor data integrity left a sour taste in my mouth. I personally would prefer to work with a database that has MongoDB syntax, but traditional transactional integrity.

NeDB

I initially found NeDB. It's an embedded database that's almost a drop-in replacement for MongoDB. Although I found it a joy to work with, it did have some problems:

The reason why I initially chose an embedded database is that traditional web hosts typically allow filesystem access and then periodically back up the files. For example, for a while I used PHP blogging software on andrewrondeau.com that didn't need a database. Instead, it just wrote to the filesystem and relied on my web host to back up periodically.

Considering that a personal blog won't get high usage, an embedded database on a traditional web host is a perfect way to keep development and hosting simple…

Postgres

…But, when I looked around for NodeJS hosting, I found that most traditional web hosts don't offer NodeJS. Perhaps it's because they use Apache, and NodeJS is fundamentally incompatible with Apache? Either way, Heroku offered free hosting, but the catch was that any writes to the file system were lost due to how it scales.

I had no objection: Z3 is a learning project and Postgres is popular now, so I decided to switch to Postgres.

Initially, rewriting the layer that used NeDB was easy, because I had a great set of unit tests. The challenge came from the various configuration files that I wrote to the file system, and read at startup:

  • User-specified config, like blog name
  • Session management information
  • The password
  • Profile pictures

At startup, in NodeJS it's common to make blocking calls to read from the filesystem, and that's how I read things like the user-specified configuration and session management information. These were JSON files so they would be easy to adjust. But… That's not how you do things in modern Heroku! All of this configuration needed to move to the database. This required me to refactor the startup sequence. More on this later.
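The blocking style of startup read described above can be sketched as follows. This is an illustration, not Z3's actual code; loadConfigSync, the default values, and the file layout are assumptions.

```javascript
const fs = require('fs');

// Reads a JSON config file with a blocking call. Blocking is acceptable at
// startup because no requests are being served yet.
function loadConfigSync(configPath, defaults = { blogName: 'My blog' }) {
  try {
    const fromFile = JSON.parse(fs.readFileSync(configPath, 'utf8'));
    return { ...defaults, ...fromFile };
  } catch (err) {
    if (err.code === 'ENOENT') {
      return { ...defaults }; // first run: no config file has been written yet
    }
    throw err; // unreadable or malformed config should stop startup
  }
}
```

The convenience is that the caller gets a plain object back with no promise involved. That same convenience is what disappears once the data lives in a database, because every database call is asynchronous.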

One challenge that I had was that I prefer using UUIDs for primary keys if they will be exposed to the user. I tried switching a table to UUIDs, but it just didn't work. UUID support isn't enabled by default in Postgres, and I had a problem where sometimes when I enabled UUID support, it still didn't work. I suspect it's a bug in the Postgres application for macOS. In the interest of time, I decided to stick with numerical IDs. (IDs are only exposed to the owner of the application, limiting the attack surface.)

Another thing that I took the time to learn was events in Postgres. Through its LISTEN and NOTIFY commands, Postgres also works as an event bus. I used this because I believed it was important to cache configuration information in RAM instead of constantly loading it from the database. In the case of running multiple web servers, the event bus in Postgres tells each server to refresh the configuration information when it changes.
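A minimal sketch of that caching pattern, under some assumptions: client is a connected node-postgres (pg) Client, which emits a notification event carrying the channel name; loadConfig and the channel name config_changed are hypothetical; and this is not Z3's actual code.

```javascript
// Keeps configuration in memory and reloads it whenever Postgres delivers a
// notification on the given channel.
async function cacheConfig(client, loadConfig, channel = 'config_changed') {
  const cache = { current: null };

  client.on('notification', async (msg) => {
    if (msg.channel !== channel) return; // some other channel: ignore
    try {
      cache.current = await loadConfig();
    } catch (err) {
      // Keep serving the previous config rather than crashing the server.
      console.error('Could not refresh config:', err);
    }
  });

  // Subscribe before the first load so a change made in between is not missed.
  // LISTEN takes an identifier, not a bind parameter, so the channel name must
  // be a constant from code, never user input.
  await client.query(`LISTEN ${channel}`);
  cache.current = await loadConfig();
  return cache;
}

// Whichever server saves new settings would then announce the change, e.g.:
//   await client.query("SELECT pg_notify('config_changed', '')");
```

Every server process holds its own cache object, so request handlers read cache.current from memory and only the notification triggers a database round trip.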

Startup

One major career lesson that I've learned is to make sure that your startup order is well-defined. 

A big problem that I had was that the default template for creating an ExpressJS web app constructs your application object synchronously. This is fine if you're only reading from disk, but if you need to do anything with promises, you're in trouble.

I refactored the default www file to handle a promise when constructing my application object. This allowed me to use async / await to set up the default schema, and then load configurations from the schema.

But, in general, refactoring startup to query the database was very painful. NodeJS allows making blocking calls to the filesystem at startup, which I used to read various configuration files. Refactoring all of this to use the database required paying very careful attention to when initial database connections and schema management were made. It took me longer to refactor startup code away from configuration files than it took me to refactor data access code away from NeDB.
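The shape of an async startup can be sketched with Node's built-in http module. This is not Z3's www file: createApp, start, ensureSchema, and loadConfig are hypothetical names, and the plain request listener stands in for an Express app object.

```javascript
const http = require('http');

// Builds the application only after the database is ready. The two async
// steps are injected so the required order is explicit: schema, then config.
async function createApp({ ensureSchema, loadConfig }) {
  await ensureSchema();              // create or upgrade tables first
  const config = await loadConfig(); // only now is it safe to read settings

  // In an Express project this would configure and return the app object.
  return (req, res) => {
    res.setHeader('Content-Type', 'text/plain');
    res.end(config.blogName);
  };
}

// What the startup script does: wait for the app, then start listening.
async function start(deps, port) {
  const app = await createApp(deps);
  const server = http.createServer(app);
  await new Promise((resolve, reject) => {
    server.once('error', reject); // e.g. the port is already in use
    server.listen(port, resolve);
  });
  return server;
}
```

Because nothing listens on the port until createApp has resolved, no request can arrive while the schema or configuration is still half loaded.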

Hosting In Heroku

So far, I haven't had many problems with Heroku. Most of my problems were bugs in my code or my libraries:

  • Postgres wasn't enabled by default. This was easy to fix.
  • I had to configure Express to trust Heroku's proxy…
  • … But client-sessions has a strange bug where it also needs to be told to trust the proxy
  • Heroku doesn't construct empty folders. This caused a problem because I scan a folder looking for user-defined themes
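Two of the fixes above are small enough to sketch. The trust proxy setting is a real Express setting; the helper names, the themes directory, and the guess that the folder goes missing because Git does not track empty directories are assumptions, and the client-sessions fix is left out because it depends on that library's own options.

```javascript
const fs = require('fs');

// Tells Express to trust the X-Forwarded-* headers from the first proxy in
// front of it, so it can tell that the original request used HTTPS.
function trustFirstProxy(app) {
  app.set('trust proxy', 1);
}

// Lists the entries in the themes folder, creating the folder first so an
// empty directory that did not survive deployment is not an error.
function listThemes(themesDir) {
  fs.mkdirSync(themesDir, { recursive: true }); // no-op if it already exists
  return fs.readdirSync(themesDir);
}
```

Creating the folder on demand means the same code works on a fresh deployment and on a developer machine where the folder already holds themes.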

Conclusion

So, in conclusion, NodeJS is an industrial-strength modern development stack. If I were developing a professional web application, what would I do differently?

  • I'd evaluate TypeScript. Javascript's weak typing grew annoying when I needed to look up values. I suspect this is because I didn't add enough documentation.
  • I'd evaluate frameworks like Angular or React, and/or use JQuery. Even when doing most rendering on the server, it's hard to avoid client-side Javascript.
  • I'd start with Postgres
  • I'd take the time to learn something like WebPack