
tusd v2: better hooks, network resilience and IETF protocol

Published by Marius Kleidl

Exactly four years after the last major release of tusd, version 1.0.0, it is time for another big announcement: we are happy to say that tusd v2 is finished! The release is ready for production and has been battle tested at scale at Transloadit in the past months. tusd is our server implementation in Go for the tus resumable upload protocol.

This major release focuses on two main aspects: interoperability and resilience. The reworked hook system makes it easier to integrate tusd into your application, while also extending its functionality. Furthermore, we focused on the way tusd handles degraded network connectivity and made numerous changes to improve its resilience in such circumstances. Besides these main aspects, there are additional improvements, which are all described in the following sections.

If you want to try out the new release, you can find the source code and pre-built binaries on GitHub.

Before going into more detail, we want to emphasize that there are no breaking changes in the HTTP interface for tus uploads. Existing tus clients will continue to be able to communicate with tusd, without any changes being necessary. You can replace your tusd v1 deployments with v2, and all clients will continue to function.

Hook system

This major release includes an entirely reworked hook system. Hooks enable bi-directional communication between the tus upload server and your main application. During the lifecycle of an upload, events are emitted that can notify your application whenever an upload is started or finished, or while data is being transmitted. This allows for a tight integration of resumable uploads into your system.

In previous versions, the hook interface was mainly designed for one-directional communication flowing from tusd to your main application. Over time, we realized that this approach is too limited. Developers also want to send feedback to tusd in order to reject uploads before they are created, send fully customized error messages to the clients, and include additional information in the responses once an upload is complete. tusd v1 did not have proper support for these tasks.

With tusd v2, we have completely reworked the hook system, making it more powerful than before.

To demonstrate its new capabilities, here is an example of an HTTP server written in Python 3 that checks whether the client provided a filename via metadata. If not, a custom error response is sent to the client. If validation passes, the server instructs tusd to use a specific ID for this upload and to store the creation time in the metadata.

from http.server import HTTPServer, BaseHTTPRequestHandler

import json
import time
import uuid

class HTTPHookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read entire body as JSON object
        content_length = int(self.headers['Content-Length'])
        request_body = self.rfile.read(content_length)
        hook_request = json.loads(request_body)

        # Prepare hook response structure
        hook_response = {
            'HTTPResponse': {
                'Headers': {}
            }
        }

        # Use the pre-create hook to check whether a filename has been supplied
        # via metadata. If not, the upload is rejected with a custom HTTP response.
        # Otherwise, a custom upload ID with a prefix of your choice is supplied
        # and the metadata is replaced, so that it only retains the filename
        # and the creation time.
        if hook_request['Type'] == 'pre-create':
            meta_data = hook_request['Event']['Upload']['MetaData']
            is_valid = 'filename' in meta_data
            if not is_valid:
                hook_response['RejectUpload'] = True
                hook_response['HTTPResponse']['StatusCode'] = 400
                hook_response['HTTPResponse']['Body'] = 'no filename provided'
            else:
                hook_response['ChangeFileInfo'] = {}
                hook_response['ChangeFileInfo']['ID'] = f'prefix-{uuid.uuid4()}'
                hook_response['ChangeFileInfo']['MetaData'] = {
                    'filename': meta_data['filename'],
                    'creation_time': time.ctime(),
                }

        # Send the data from the hook response as JSON output
        response_body = json.dumps(hook_response)
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(response_body.encode())


httpd = HTTPServer(('localhost', 8000), HTTPHookHandler)
httpd.serve_forever()

Once this hook server has been started, you can run tusd and tell it to send its hooks to that server:

$ tusd -hooks-http http://localhost:8000

This only scratches the surface of what is possible now, and we encourage you to check out the additional examples in the repository. They include code for HTTP and gRPC servers, scripts, and the new plugin system, covering a variety of use cases. You can read more about the new hook system in the documentation. If you are already using hooks, upgrading to tusd v2 is simple. The structure of the hook information has changed slightly, but it still provides all the data it did before.
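If you want to try out the validation logic from the Python example without starting tusd, one option is to pull the decision into a plain function and feed it hand-written hook requests. The following is a minimal sketch under that assumption: `decide_pre_create` and `sample_request` are hypothetical helper names, and the sample request is trimmed down to the fields that the example above reads (`Type` and `Event.Upload.MetaData`). Real hook requests from tusd carry more information.

```python
import uuid


def decide_pre_create(hook_request):
    """Build a hook response for one hook request (illustrative helper)."""
    hook_response = {'HTTPResponse': {'Headers': {}}}

    # Only the pre-create hook is handled; other hooks get an empty response.
    if hook_request.get('Type') != 'pre-create':
        return hook_response

    meta_data = hook_request['Event']['Upload'].get('MetaData') or {}
    if 'filename' not in meta_data:
        # Reject the upload with a custom HTTP response.
        hook_response['RejectUpload'] = True
        hook_response['HTTPResponse']['StatusCode'] = 400
        hook_response['HTTPResponse']['Body'] = 'no filename provided'
    else:
        # Accept the upload, set a custom ID and keep only the filename.
        hook_response['ChangeFileInfo'] = {
            'ID': f'prefix-{uuid.uuid4()}',
            'MetaData': {'filename': meta_data['filename']},
        }
    return hook_response


def sample_request(meta_data):
    """Trimmed-down stand-in for a pre-create hook request (assumption)."""
    return {'Type': 'pre-create', 'Event': {'Upload': {'MetaData': meta_data}}}


rejected = decide_pre_create(sample_request({}))
accepted = decide_pre_create(sample_request({'filename': 'report.pdf'}))

print(rejected['HTTPResponse'])
print(accepted['ChangeFileInfo'])
```

Keeping the decision separate from the HTTP handler makes it easy to cover the reject and accept paths with ordinary unit tests before wiring the server up to tusd.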

Network resilience

Tusd must accept large uploads, even over unreliable network connections. It must therefore be resilient to various kinds of network issues and interruptions. For tusd v2, we worked on improvements in this area as well.

A new end-to-end test suite has been developed that simulates various network conditions, such as slow upload speeds, dead connections, and concurrent requests. It allows us to observe tusd’s behavior and asserts its correctness in these situations. This test suite has been central in our latest efforts to ensure that the tusd v2 release is ready for production.

In the past, concurrent upload requests have proven an especially tricky topic. tusd does not support concurrent requests to the same upload resource, as parallel writes could trigger data loss. This rule is enforced through locks, which must be acquired before accessing or modifying an upload. While this solves the problem of concurrent access, it creates another one: when an ongoing upload is interrupted by network issues, the server might be unaware of this interruption. It could then assume that the upload is still continuing and thus hold on to the upload lock. The client, on the other hand, might realize this interruption and abandon the previous upload request to resume the upload. This resumption would fail, because the upload is still locked from the previous request until the connection times out. Depending on the setup, such a timeout can be lengthy, which is likely to cause annoyances for the end users.

As a solution, tusd v2 allows new requests to ask previous requests to the same upload resource to release their upload lock in a controlled and safe manner. This solves the problem of locked uploads during resumption, while also ensuring no data corruption. You can read more on the topic of upload locks in the corresponding documentation.
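The idea of asking a previous request to give up its lock can be modeled in a few lines. The sketch below is not tusd's implementation (tusd is written in Go); it is a simplified, in-process illustration in Python, and the names `HandoverLock` and `stalled_upload` are made up for this example. The holder registers a callback when it takes the lock, and a newcomer invokes that callback and then waits, with a timeout, until the holder has released the lock cleanly.

```python
import threading


class HandoverLock:
    """Illustrative lock whose holder can be asked to release it."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = False
        self._holder_callback = None

    def acquire(self, on_release_requested, timeout=5.0):
        """Take the lock, asking the current holder (if any) to let go first.

        on_release_requested is called when a later request wants the lock.
        It should return quickly, for example by setting an event.
        """
        with self._cond:
            if self._held and self._holder_callback is not None:
                self._holder_callback()
            if not self._cond.wait_for(lambda: not self._held, timeout=timeout):
                return False  # the holder did not release in time
            self._held = True
            self._holder_callback = on_release_requested
            return True

    def release(self):
        with self._cond:
            self._held = False
            self._holder_callback = None
            self._cond.notify_all()


def stalled_upload(lock, started, log):
    """Simulates a request whose connection died without the server noticing."""
    stop = threading.Event()
    lock.acquire(stop.set)
    started.set()
    stop.wait()  # stands in for reading a request body that never arrives
    log.append('old request released lock')
    lock.release()


lock = HandoverLock()
log = []
started = threading.Event()
worker = threading.Thread(target=stalled_upload, args=(lock, started, log))
worker.start()
started.wait()

# The client resumes: the new request asks the stalled one to step aside.
resumed = lock.acquire(lambda: None, timeout=2.0)
log.append('new request acquired lock' if resumed else 'lock acquisition timed out')
lock.release()
worker.join()
print(log)
```

The important property is that the old request releases the lock itself, after it has stopped writing, rather than having the lock taken away from it. That is what keeps the hand-over safe against concurrent writes.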

Support for the new IETF protocol

As we shared in previous blog posts, we are actively working with the HTTP working group of the Internet Engineering Task Force (IETF) on a new standard for resumable file uploads. With an official standard, support for resumable uploads could be added directly to browsers, proxies, HTTP servers, and mobile platforms. While development is still ongoing and the document remains a draft for now, tusd already offers preliminary support for this new, experimental protocol. It serves as a playground in which you can explore and test out the standard while we are writing it. However, neither the specification nor its implementation is ready for production use yet. More details on the overall journey and the future of tus are available in our previous blog post.

If you want to give it a try, the experimental protocol can be enabled via the -enable-experimental-protocol flag:

$ tusd -enable-experimental-protocol

tusd will then accept traditional tus uploads, while simultaneously supporting uploads via the new protocol at the same endpoint. A list of compatible clients with corresponding examples is also available at github.com/tus/draft-example. If you have any feedback, feel free to reach out to us!

Further improvements

Besides an improved hook system and more resilience, tusd v2 also includes enhancements in the following areas:

Breaking changes

Unfortunately, this release also brings along a few breaking changes. Affected users are required to adjust their tusd installation, even though the necessary changes should be minimal.

That being said, tusd v2 does not introduce a breaking change for its HTTP interface. All current tus clients can be used without modifications with tusd v2. Tusd can be seamlessly upgraded in production from v1 to v2, without changes to the clients.

Future development

With the v2 release out the door, we are already thinking about the future development of tusd. Our next point of focus will be to add support for the expiration and checksum extensions to tusd, so that all extensions from the specification are implemented. With the expiration extension, tusd will be able to clean up unfinished and/or finished uploads after an expiration period, to ensure unused resources are available again. With the checksum extension, the server can verify the integrity of the uploaded data by comparing it to client-provided checksums and thereby detect any transmission errors. In addition to those two extensions, we also want to make horizontal scaling of tusd easier. Right now, running a tusd cluster can be challenging and we want to explore multiple ideas on how to make this easier.
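To give an idea of what checksum verification involves, here is a minimal Python sketch. It is not code from tusd and says nothing about how tusd will eventually implement the extension. It assumes the format from the tus 1.0 checksum extension, where the client sends an `Upload-Checksum` header consisting of the algorithm name and the Base64-encoded checksum of the request body, separated by a space. The function name `verify_upload_checksum` and the set of supported algorithms are choices made for this example.

```python
import base64
import hashlib
import hmac

# Algorithms accepted by this sketch (an assumption, not tusd's list).
SUPPORTED_ALGORITHMS = {'sha1', 'md5', 'sha256'}


def verify_upload_checksum(header_value, body):
    """Check a request body against an Upload-Checksum header value.

    Returns a status code: 204 if the checksum matches, 400 if the header
    is malformed or names an unsupported algorithm, 460 on a mismatch.
    """
    parts = header_value.split(' ')
    if len(parts) != 2 or parts[0] not in SUPPORTED_ALGORITHMS:
        return 400
    algorithm, encoded = parts

    try:
        expected = base64.b64decode(encoded, validate=True)
    except ValueError:
        return 400  # not valid Base64

    actual = hashlib.new(algorithm, body).digest()
    # compare_digest avoids leaking where the two digests differ.
    return 204 if hmac.compare_digest(actual, expected) else 460


chunk = b'hello world'
header = 'sha1 ' + base64.b64encode(hashlib.sha1(chunk).digest()).decode()

ok_status = verify_upload_checksum(header, chunk)
corrupted_status = verify_upload_checksum(header, chunk + b'!')
unsupported_status = verify_upload_checksum('crc32 AAAA', chunk)
print(ok_status, corrupted_status, unsupported_status)
```

Because the checksum covers the body of a single request, a client that receives a mismatch only needs to retransmit that one chunk rather than the whole file.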

Thank you

Finally, we want to thank all contributors who dedicated their time and efforts towards this release. It would not be possible without you!