What Is a Node.js Stream?

Limitations, benefits and use cases for Node.js streams

By Mario Kandut


Streams are a built-in feature of Node.js and represent an asynchronous flow of data. Streams are also a way to handle reading and/or writing files. A Node.js stream can help process files larger than the free memory of your computer, since it processes the data in small chunks.

This article is based on Node v16.14.0.

Streams in Node.js


This is the first article of a series about streams in Node.js. It aims to give an overview of the different types of streams and of their limitations, benefits and use cases.


What are streams?

Streams are an interface for working with streaming data. A Unix pipe | is a good mental model for streams. Essentially, a stream is a collection of data that isn't available all at once. The streamed data arrives in small chunks, and as a result we handle each chunk asynchronously, as it arrives.

In Node.js, many built-in modules use streams to handle asynchronous data processing; for example, the http module uses streaming interfaces with ClientRequest and ServerResponse. Stream data is a Buffer by default, unless the stream is configured to operate in object mode. This means the data is buffered in memory as it passes through the stream.

Why use streams?

Streams let us work with data that is too large to fit into memory, one chunk at a time. For instance, suppose you are working with a 50 GB file of analytics data with millions of rows. Reading this file into memory would take a long time and would eventually hit the memory limit of Node.js or of your local machine. Handling this file with a stream, we can process one row of the dataset at a time and never have to read the whole file into memory. Hence, streams are memory efficient.

Streams are also useful in other scenarios. Reading a large file into memory (assuming it fits) takes some time before any of the data is usable. When consuming data from a stream, it's readable the moment the first chunk arrives. This means streams are time efficient compared to reading data into memory.

Streams can be combined with other streams. For instance, the output of one stream can be used as the input of another stream. This allows us to connect streams into a pipeline through which data can flow between the streams. Hence, streams are composable.

Types of streams

There are five types of streams in the built-in stream module of Node.js (see the docs):

  • Readable: You receive data from a readable stream.
  • Writable: You stream data to a writable stream. It is also referred to as a sink, because it is the end destination of streaming data.
  • Duplex: A duplex stream implements both interfaces, readable and writable. An example of a duplex stream is a TCP socket, where data flows in both directions.
  • Transform: A transform stream is a type of duplex stream in which the data passing through is transformed, so the output differs from the input. Data can be sent to a transform stream and read back after it has been transformed.
  • PassThrough: A PassThrough stream is a transform stream that doesn't transform the data passing through. It is mainly used for testing and examples.

Out in the wild, you are most likely to encounter readable, writable and transform streams.

Stream Events

All streams are instances of EventEmitter. EventEmitters are used to emit and respond to events asynchronously; read more about them in the article Event Emitters in Node.js. The events emitted by streams can be used to read and/or write data, manage the stream state, and handle errors.

Though streams are instances of EventEmitter, it is not recommended to consume streams by just listening to their events. Instead, the recommended way is to use the pipe and pipeline methods, which consume streams and handle these events for you.

Working with stream events is useful when you need more control over how a stream is consumed, for instance to trigger an action when a particular stream begins or ends. Have a look at the official Node.js docs on streams for more information.

Readable stream events

  • data - emitted when the stream outputs a data chunk.
  • readable - emitted when there is data ready to be read from the stream.
  • end - emitted when no more data is available.
  • error - emitted when an error has occurred within the stream, and an error object is passed to the handler. Unhandled stream errors can crash the application.

Writable stream events

  • drain - emitted when the writable stream's internal buffer has been cleared and the stream is ready to have more data written to it.
  • finish - emitted when all data has been written.
  • error - emitted when an error occurs while writing data; an error object is passed to the handler. Unhandled stream errors can crash the application.

TL;DR

  • Streams are an interface for working with streaming data.
  • Stream data is a buffer by default.
  • Streams are memory efficient. They consume only minimal amounts of memory.
  • Streams are time efficient, data is readable as soon as the first chunk arrives.
  • Streams are composable, they can be connected and combined with other streams.
  • All streams are instances of EventEmitter, but listening to stream events directly is not the recommended way to consume a stream.
  • Listening to stream events is useful when you want to trigger something as the stream starts or ends.

Thanks for reading and if you have any questions, use the comment function or send me a message @mariokandut.

If you want to know more about Node, have a look at these Node Tutorials.

References (and Big thanks):

HeyNode, Node.js - Streams, MDN - Writable Stream, MDN - Streams
