Parsing Beyond JSON with swift-parsing

Published: September 23, 2024
Written by:
Alejandro Martinez

In today’s development landscape, parsing is often associated primarily with handling JSON from server API calls. We frequently rely on high-level frameworks like Codable to ingest JSON and convert it into our data models. However, parsing encompasses much more than that—it touches almost every aspect of development. By viewing parsing through a new lens and adding a proper tool to your toolbelt, you can tackle problems with a fresh perspective.

What is parsing?

Consider the typical example of parsing a JSON response. We can see how parsing takes the JSON string, reads it, understands the special characters and format that make up the JSON specification, and outputs data on the other side.

Usually, this data consists of basic JSON objects, arrays, and values. But Swift developers often prefer working with strong types, which is why it’s common to take an additional step to process JSON output into custom Swift types using the Decodable protocol. This demonstrates how we go from an HTTP response blob of data, which we read as a string, to Swift types. This is what parsing is: transforming unstructured data into structured data.
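As a minimal sketch of that last step, the snippet below decodes a small JSON payload into a Swift type with Decodable. The User type, its fields, and the payload are assumptions made up for this illustration.

```swift
import Foundation

// Hypothetical model; the field names are assumptions for this sketch.
struct User: Decodable, Equatable {
    let id: Int
    let name: String
}

// Unstructured input: raw bytes, as they might arrive in an HTTP response body.
let responseBody = Data(#"{"id": 7, "name": "Ada"}"#.utf8)

// Structured output: a strongly typed Swift value (nil if decoding fails).
let user = try? JSONDecoder().decode(User.self, from: responseBody)
```

The bytes carry no type information on their own; the decoder plus the User declaration is what gives them structure.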

Once you understand parsing as this transformation process, you start to realize that many things fit this definition. Examples include transforming a hex color string into a proper Color type, parsing strings as dates (what Foundation calls Date Formatters), understanding the HTTP request structure, and parsing languages like arithmetic operations or even a programming language like Swift.
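To make the hex color case concrete, here is a small hand-written sketch in plain Swift. RGBColor and parseHexColor are hypothetical names invented for this example (this is not SwiftUI’s Color), and only the six-digit RRGGBB form is handled.

```swift
// Hypothetical color type for this sketch (not SwiftUI's Color).
struct RGBColor: Equatable {
    let red: UInt8
    let green: UInt8
    let blue: UInt8
}

// Parses strings such as "#FF8800" or "ff8800"; returns nil for anything else.
func parseHexColor(_ text: String) -> RGBColor? {
    var digits = text[...]
    if digits.hasPrefix("#") { digits.removeFirst() }
    // Exactly six hexadecimal digits are expected: RRGGBB.
    guard digits.count == 6,
          digits.allSatisfy(\.isHexDigit),
          let value = UInt32(digits, radix: 16)
    else { return nil }
    return RGBColor(
        red: UInt8((value >> 16) & 0xFF),
        green: UInt8((value >> 8) & 0xFF),
        blue: UInt8(value & 0xFF)
    )
}

let orange = parseHexColor("#FF8800")
let invalid = parseHexColor("not a color")
```

The input is an arbitrary string and the output is either a well-formed value or a failure, which is the same shape as every parser in the rest of this article.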

There is extensive literature on parsing, from recursive descent parsers to language specifications that automatically generate parsers. Rolling a handmade parser for a specific domain is an exercise that every developer should try at least once in their career. For example, implementing a language interpreter can be a very valuable learning experience. However, after writing a few parsers, you realize that there are some things you always need. At that point, even if you want to write things yourself, you may want to reach for a library that helps you get things going faster.

Meet swift-parsing

Swift Parsing by Point-Free is a package for parsing that focuses on composability and generality without sacrificing performance. You can construct small parsers that focus on one part of the problem, and then use higher-order parsers to compose them together. This approach, combined with the set of basic focused parsers that come with the library, allows for parsing any data in record time, making it a powerful tool for tackling a wide range of parsing challenges.

This library is especially useful for developers working on tooling, even if it’s just fixing or improving automation pipelines. For instance, you could create a parser to ingest Xcode logs. Another good example is adding extra metadata to a Package.swift file when the options provided by SPM are not sufficient. Parsing to the rescue!

Extending a Package.swift file

Imagine a basic Package.swift file that has one dependency:

let package = Package(
	name: "...",
	products: [
		.executable(name: "...", targets: ["..."]),
	],
	dependencies: [
		.package(url: "https://github.com/apple/swift-argument-parser", exact: "1.5.0"),
	],
	targets: [
		.executableTarget(
			name: "...",
			dependencies: [ ... ]
		),
	]
)

SPM provides a subcommand, swift package dump-package, that dumps the structure of your package as JSON.

[Figure: SPM JSON dump]

But this only works for the information that SPM is aware of. Imagine you want to improve your tooling and want to add more information to each dependency your package relies on. For example, you might want to add an owner field to clarify which team member is responsible for keeping the dependencies up to date.

One approach could be to define a new function as an extension of the dependency type. Thanks to Swift extensions and SPM’s ability to execute Swift code, you can implement this solution quite easily.

// Under packageDependencies:
[
	.package(
		url: "https://github.com/apple/swift-argument-parser",
		exact: "1.5.0",
		owner: "Martin" // Added by the extension below
	)
]

// Somewhere else...
extension PackageDescription.Package.Dependency {
	static func package(
		url: String,
		exact version: Version,
		owner: String
	) -> PackageDescription.Package.Dependency {
		.package(url: url, exact: version)
	}
}

That works, and SPM won’t notice any difference, so your entire workflow remains unaffected. However, SPM won’t include the metadata we added in the JSON output because it doesn’t know about it. And yet, we added that metadata for a reason. At this point, it may seem like what we’ve done has been for nothing, but…

What if you could extract that metadata yourself? The good news: you can! By parsing the Package.swift file and pulling out the data we need. That may sound daunting, but thanks to swift-parsing, it’s surprisingly straightforward.

Note: Parsing a language as complex as Swift is not trivial, and projects like SwiftSyntax vividly demonstrate this. When faced with these problems, it’s often better to focus on solving exactly what’s needed, instead of aiming for the most general solution. That’s the approach this article will take.

Setting up the Project

To get started, you can download the starter project.

It’s a basic SPM executable package that depends on swift-parsing and swift-argument-parser. It includes the start of a command that reads a file and prints its contents to the console. You will also find an ExamplePackage.swift, which is the target file we will be parsing.

You should be able to run the project with swift run example ExamplePackage.swift.

Creating a Parser

Let’s start by creating our parser. You’ll notice that parsers are just instances of a type, rather than types conforming to a protocol. While protocol conformance is possible, it’s often reserved for more complex and higher-order parsers. For most of your needs, creating an instance with Parse and a result builder closure is sufficient.

let dependencyLine = Parse {
	PrefixUpTo(".package")
}

The first step is to parse everything up to a line defining a dependency. PrefixUpTo is one of the parsers that comes with the library, and it consumes everything until it finds the given string. An important note is that it stops just before that string, which means our “parsing cursor” is positioned right before the dot character . in .package.
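The cursor behavior can be sketched in plain Swift without the library. The consumePrefix(upTo:from:) helper below is hypothetical and only mimics the behavior described above; it is not how swift-parsing implements PrefixUpTo, and the sample input is made up.

```swift
import Foundation

// Hypothetical helper, not part of swift-parsing: consumes everything before `match`.
func consumePrefix(upTo match: String, from input: inout Substring) -> Substring? {
    // Fail without consuming anything when the match is absent.
    guard let range = input.range(of: match) else { return nil }
    let consumed = input[..<range.lowerBound]
    // Move the cursor to the start of the match; the match itself stays unconsumed.
    input = input[range.lowerBound...]
    return consumed
}

var input = "dependencies: [\n\t.package(url: \"https://example.com/repo\")"[...]
let consumed = consumePrefix(upTo: ".package", from: &input)
```

After the call, consumed holds the text before the match and input starts at the dot of .package.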

To see this in practice, it’s useful to set up some debug printing very early in the process. It can sometimes be tricky to wrap your head around what’s happening, so having a way to visualize the result is crucial.

var input = packageContents[...] // loaded String from the file
let parsed = try dependencyLine.parse(&input)
print("parsed>>\(parsed)<<")
print("rest>>\(input)<<")

The string loaded from the file is converted to a Substring using [...] and stored in a variable. This allows it to be passed to the parser as inout, letting the parser consume the input as it goes along.
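A small stand-alone sketch of that idea, using only the standard library: the Substring is a view into the original String, and a function taking it as inout can shrink it from the front. The consume(_:from:) helper is a hypothetical name used only for this illustration.

```swift
let contents = "name: parser"
var input = contents[...]   // Substring view over the same storage; the text is not copied

// Hypothetical helper: consumes `literal` from the front of `input` if present.
func consume(_ literal: String, from input: inout Substring) -> Bool {
    guard input.hasPrefix(literal) else { return false }
    input.removeFirst(literal.count)
    return true
}

let matched = consume("name: ", from: &input)
// `input` now holds only what is left to parse; `contents` is untouched.
```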

If we run this as is, we can see how parsed contains all the content of the file up to, but not including, “.package”.

[Figure: Example output]

Skipping Unnecessary Data

Right now, the PrefixUpTo parser is taking the result from the parsing, but we don’t actually need that part of the file, so it should be discarded. This introduces the concept of skipping, which simply requires wrapping any parser with Skip.

Skip {
	PrefixUpTo(".package")
}

As soon as we make this change, we’ll get a warning because now the result of parsing is Void, since we’re skipping the only parsing operation we have. You can clearly see this if you add a type to the variable:

let dependencyLine: some Parser<Substring, Void>

A parser has two associated types: the input and the output. Usually, we don’t have to declare this as Swift infers it from the result builder closure of Parse, but sometimes it’s useful to keep it in mind.
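To picture those two associated types, here is a deliberately simplified protocol written from scratch. MiniParser, LeadingDigits, and SkipSpaces are hypothetical names for this sketch; the library’s real Parser protocol has more to it than this.

```swift
// A simplified, hypothetical protocol for illustration only.
protocol MiniParser<Input, Output> {
    associatedtype Input
    associatedtype Output
    func parse(_ input: inout Input) throws -> Output
}

struct ParsingFailure: Error {}

// Input is Substring, Output is Int.
struct LeadingDigits: MiniParser {
    func parse(_ input: inout Substring) throws -> Int {
        let digits = input.prefix(while: { $0.isASCII && $0.isNumber })
        guard let value = Int(digits) else { throw ParsingFailure() }
        input.removeFirst(digits.count)
        return value
    }
}

// Input is Substring, Output is Void: it consumes text but produces nothing.
struct SkipSpaces: MiniParser {
    func parse(_ input: inout Substring) {
        input = input.drop(while: { $0 == " " })
    }
}

let number: some MiniParser<Substring, Int> = LeadingDigits()
var text = "42   rest"[...]
let value = try? number.parse(&text)
SkipSpaces().parse(&text)
```

A parser whose output is Void, like SkipSpaces here, still does useful work: it moves the cursor forward.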

Now that we’re at the line we want to be, let’s move forward and skip everything until we find the URL of the dependency.

let dependencyLine: some Parser<Substring, Void> = Parse {
	Skip {
		PrefixUpTo(".package")
	}
	Skip {
		".package(url:"
		Whitespace.init()
		#"""#
	}
}

To ensure our approach is correct, we can run the code and observe how the output shows that the start of the pending substring is exactly where we want it:

parsed>>()<<
rest>>https://github.com....

Above, we’ve used two new parsers. The string literal acts as a parser as the library makes the String type conform to the parser protocol. Whitespace, as its name implies, is a parser that consumes whitespace; it has several useful options, such as handling newlines.

Parsing the URL

Enough skipping. Now it’s time to parse the data we actually want. In this case, it should be enough to parse the input until we find the closing ". We can employ Prefix(while:) for this, which consumes characters for as long as the predicate holds, so here it stops right before the closing quote:

Prefix(while: { $0 != #"""#})

Note: Using Swift’s raw strings makes it easier to write the matching string you want to find. By using #" and "#, you can write " inside the string and it will be treated as that literal character, without the need to escape it. Using #"""# is equivalent to "\"".
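A quick way to convince yourself of that equivalence is to compare the raw and the escaped spellings directly. This snippet uses only the standard library.

```swift
let quote = #"""#                  // raw string holding a single double-quote character
let escapedQuote = "\""            // the same value, written with an escape

let separator = #"", exact: ""#    // raw spelling of the text between two parameters
let escapedSeparator = "\", exact: \""

let sameQuote = quote == escapedQuote
let sameSeparator = separator == escapedSeparator
```

The raw spelling pays off most in strings like the separator, where the escaped version is much harder to read.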

If you add this line, the compiler will complain as the type of the parser is no longer Void.

// ERROR: Return type of let 'dependencyLine' requires the types 'Substring' and '()' be equivalent
let dependencyLine: some Parser<Substring, Void> = Parse {
	Skip {
		PrefixUpTo(".package")
	}
	Skip {
		".package(url:"
		Whitespace.init()
		#"""#
	}
	Prefix(while: { $0 != #"""#})
}

You might be surprised to find that the output is now Substring. That works, but a String is often easier to work with further down the line. Thankfully, parsers have a map operation that transforms their output into a new type.

Prefix(while: { $0 != #"""#}).map(String.init)

Now we can add this new line in our parser and correct the type:

let dependencyLine: some Parser<Substring, String> = Parse {
	Skip {
		PrefixUpTo(".package")
	}
	Skip {
		".package(url:"
		Whitespace.init()
		#"""#
	}
	Prefix(while: { $0 != #"""#}).map(String.init)
}
// prints
parsed>>https://github.com/apple/swift-argument-parser<<
rest>>", exa...

And we have our URL! It’s worth noting that we could map it to the URL type to get a proper URL in the output. However, this would return an optional URL, as that’s what the initializer produces. For this exercise, a String is used since the URL isn’t the focus of the demonstration.
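The optionality comes from Foundation itself, as this small sketch shows. It works on a plain String standing in for the parsed output and does not involve the parser.

```swift
import Foundation

// Stand-in for the String produced by the parser above.
let parsedText = "https://github.com/apple/swift-argument-parser"

// URL(string:) is a failable initializer, so the result is an Optional<URL>.
let url: URL? = URL(string: parsedText)
let emptyURL: URL? = URL(string: "")
```

Any mapping to URL therefore has to decide what a nil result means, for example by treating it as a parsing failure.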

Parsing the Version

You already know what’s next. We need to skip everything until the next parameter.

Skip {
	#"", exact: ""#
}

There isn’t much new to say here since we’re using the same approach as before. However, I’d still recommend running the code and verifying that the pending string is where you expect it to be: rest>>1.5.0....

Note: SPM has other parameters to define the version of a dependency, like from: or branch:. Using a OneOf to accommodate those variants is left as an exercise for the reader.
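Before reaching for the library, it can help to see the “try each alternative in order” idea written by hand. The sketch below is plain Swift; the Requirement enum and parseRequirement function are hypothetical names, and this is not a solution written with OneOf.

```swift
// Hypothetical model of the version requirements mentioned above.
enum Requirement: Equatable {
    case exact(String)
    case from(String)
    case branch(String)
}

// Tries each label in turn; on success consumes `label: "value"` from the input.
func parseRequirement(_ input: inout Substring) -> Requirement? {
    let alternatives: [(label: String, make: (String) -> Requirement)] = [
        ("exact", Requirement.exact),
        ("from", Requirement.from),
        ("branch", Requirement.branch),
    ]
    for (label, make) in alternatives {
        var attempt = input                    // work on a copy so a failed attempt consumes nothing
        let opening = label + ": \""
        guard attempt.hasPrefix(opening) else { continue }
        attempt.removeFirst(opening.count)
        let value = attempt.prefix(while: { $0 != "\"" })
        attempt.removeFirst(value.count)
        guard attempt.hasPrefix("\"") else { continue }
        attempt.removeFirst()
        input = attempt                        // commit only when the whole alternative matched
        return make(String(value))
    }
    return nil
}

var requirementInput = "from: \"1.0.0\", owner"[...]
let requirement = parseRequirement(&requirementInput)
```

The important detail is that a failed alternative leaves the input untouched, so the next alternative starts from the same position.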

Now we want to parse the version. For this, we can use the same parser we used earlier: Prefix(while: { $0 != #"""#}).map(String.init). Adding this will force us to change the type again. Of course, you can simply remove the type annotation and let the compiler infer it; we’re just keeping it for illustration purposes.

Repeating the same parser multiple times (and yes, we’ll have to use it a third time) isn’t elegant. But thanks to how swift-parsing is designed, it’s very easy to extract a parser. You just need to move it into its own variable.

let parameter: some Parser<Substring, String> = Parse {
	Prefix(while: { $0 != #"""#}).map(String.init)
}

Just by extracting the parser into its own variable, we now have a new tool in our parsing toolbelt, and we can use it like any other parser, as if it came directly from the library itself.

Parsing the Owner

With the new parameter parser in hand, we can easily write the next parsing step by adding another Skip and composing the new parser.

let dependencyLine: some Parser<Substring, (String, String, String)> = Parse {
	Skip {
		PrefixUpTo(".package")
	}
	Skip {
		".package(url:"
		Whitespace.init()
		#"""#
	}
	parameter
	Skip {
		#"", exact: ""#
	}
	parameter
	Skip {
		#"", owner: ""#
	}
	parameter
}
// prints
parsed>>("https://github.com/apple/swift-argument-parser", "1.5.0", "Martin")<<
rest>>"),...

Introducing Custom Types

With this, we’ve successfully parsed a dependency line. However, you may have noticed that the output is currently a tuple of strings. While this is useful during the development of the parser, now that our dependencyLine parser is complete, it’s time to upgrade the output to a proper type.

struct Dependency {
	let url: String
	let version: String
	let owner: String
}

let dependencyLine: some Parser<Substring, Dependency> = Parse(Dependency.init) { ...

By simply passing the .init function to the Parse initializer, the library will map the output of the parser to our own type. This is a significant improvement over having tuples scattered throughout the code.
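This works because a memberwise initializer can be referenced like any other function. The plain-Swift sketch below shows that by itself, without the parser; makeDependency and fields are names chosen just for this illustration.

```swift
struct Dependency: Equatable {
    let url: String
    let version: String
    let owner: String
}

// The memberwise initializer, stored as an ordinary function value.
let makeDependency: (String, String, String) -> Dependency = Dependency.init

// The tuple a parser might produce, turned into the custom type.
let fields = ("https://github.com/apple/swift-argument-parser", "1.5.0", "Martin")
let dependency = makeDependency(fields.0, fields.1, fields.2)
```

The order of the stored properties matters here: it has to match the order in which the parser produces its values.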

Parsing Many Lines

Now that we have a parser for a single dependency, we need to handle multiple lines that declare dependencies. While we’ve been integrating parsers into our main one so far, we can now keep our line parser as is and use it to parse multiple lines via composition. Fortunately, swift-parsing provides the ideal tool for this task: Many.

Before we proceed, we need to refine our dependencyLine parser. The current implementation of this parser is supposed to consume only one line of a dependency, but it actually consumes the beginning of the file until it finds the first dependency. It then stops right after consuming the owner part of the dependency line, leaving the rest of the line untouched.

This is problematic because if we want to use this parser repeatedly, the parser needs to start and end in a predictable manner so that each iteration behaves correctly. Tweaking the single parser so it composes correctly with Many can be challenging, and might require several iterations to perfect.

let dependencyLine = Parse(Dependency.init) {
	// note the removal of the first Skip
	Skip {
		".package(url:"
		Whitespace.init()
		#"""#
	}
	parameter
	Skip {
		#"", exact: ""#
	}
	parameter
	Skip {
		#"", owner: ""#
	}
	parameter
	Skip { // added so we parse until the end of the line
		#"")"#
		Optionally { "," }
	}
}

Another parser I’m using here is Optionally. As the name suggests, it attempts to run the given parser (in this case, one that matches a comma), but it won’t fail if there is no match. Instead, it simply continues parsing. I’m using this to accommodate lines that may or may not end with a comma, since that’s what Swift allows.

let dependencies = Parse {
	Skip { // moved from the previous parser
		PrefixUpTo(".package")
	}
	Many {
		dependencyLine
	} separator: {
		OneOf {
			"\n"
			",\n"
		}
		PrefixUpTo(".package")
	} terminator: {
		Whitespace()
		"]"
	}
}

The Many parser accepts a primary parser that runs on each iteration, along with two additional parsers. One of these extra parsers specifies the separation between iterations. In our case, it handles consuming the newline at the end of each line and any content until the next .package definition is found. The third parameter is a parser that determines when the iteration should stop—in our scenario, it consumes any remaining whitespace and the closing square bracket of the array.
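To make the three roles concrete, here is a hand-written loop in plain Swift that follows the same element, separator, terminator structure. The parseMany and literal helpers are hypothetical and simplified; this is a sketch of the idea, not the library’s implementation, and it parses a made-up list of integers rather than dependencies.

```swift
// Hypothetical helper: repeats `element`, with `separator` between runs, then requires `terminator`.
func parseMany<Element>(
    from input: inout Substring,
    element: (inout Substring) -> Element?,
    separator: (inout Substring) -> Bool,
    terminator: (inout Substring) -> Bool
) -> [Element]? {
    var results: [Element] = []
    var rest = input
    while true {
        var attempt = rest
        // After the first element, a separator must precede the next one.
        if !results.isEmpty && !separator(&attempt) { break }
        guard let next = element(&attempt) else { break }
        results.append(next)
        rest = attempt              // commit only after separator and element both succeeded
    }
    // The terminator must match for the whole run to count as a success.
    guard terminator(&rest) else { return nil }
    input = rest
    return results
}

// Hypothetical helper: builds a function that consumes an exact piece of text.
func literal(_ expected: String) -> (inout Substring) -> Bool {
    return { text in
        guard text.hasPrefix(expected) else { return false }
        text.removeFirst(expected.count)
        return true
    }
}

var listInput = "10,20,30] tail"[...]
let numbers = parseMany(
    from: &listInput,
    element: { (text: inout Substring) -> Int? in
        let digits = text.prefix(while: { $0.isASCII && $0.isNumber })
        guard let value = Int(digits) else { return nil }
        text.removeFirst(digits.count)
        return value
    },
    separator: literal(","),
    terminator: literal("]")
)
```

Each iteration works on a copy of the remaining input and commits only on success, which is what keeps every run starting and ending in a predictable place.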

With this, we can switch to use the dependencies parser to get an array of dependencies as our parsed output:

let parsed = try dependencies.parse(&input)
print(parsed)

[Figure: The list of dependencies you worked hard to get!]

Of course, this parser doesn’t cover every possible option that Swift or SPM might support, but it’s a solid starting point that solves the specific problem. Thanks to swift-parsing, it only took a few minutes to build. And with the help of result builder syntax and the composability of the parsers, extending it to support more variations wouldn’t take too much additional effort.

You can check out the final version of this project.

Wrap Up

I highly recommend trying out swift-parsing for yourself; its documentation is excellent. While this post hasn’t covered performance and generality, it’s worth noting that even though we’ve been working with strings, swift-parsing is capable of handling various other types of nebulous data, from Unicode scalars to raw bytes.

What’s truly impressive about swift-parsing is that it’s not just a powerful tool for parsing; it also supports printing. This means that with a well-constructed parser, you can not only convert a string into structured data but also print that structured data back into a string. It’s pretty amazing!
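The round-trip property behind that idea can be illustrated with a hand-written pair of functions in plain Swift. Pin, parsePin, and printPin are hypothetical, and the name@version format is made up for this sketch; it is not the library’s printing API, which builds both directions from one parser description.

```swift
// Hypothetical value and text format ("name@version") for this sketch.
struct Pin: Equatable {
    let name: String
    let version: String
}

// Text to structured data.
func parsePin(_ text: String) -> Pin? {
    let parts = text.split(separator: "@", omittingEmptySubsequences: false)
    guard parts.count == 2, !parts[0].isEmpty, !parts[1].isEmpty else { return nil }
    return Pin(name: String(parts[0]), version: String(parts[1]))
}

// Structured data back to text.
func printPin(_ pin: Pin) -> String {
    "\(pin.name)@\(pin.version)"
}

let original = "swift-argument-parser@1.5.0"
let roundTripped = parsePin(original).map(printPin)
```

When parsing and printing are inverses, printing a parsed value gives back the original text, which is what makes formats editable by tools.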

With this new understanding of what parsing entails and with swift-parsing added to your toolkit, I encourage you to revisit the challenges you face with fresh eyes. Parsing isn’t just a niche solution; it’s a versatile approach that might unlock solutions to problems you hadn’t previously thought of. Whether it’s for simplifying complex data manipulation or tackling tasks you once thought cumbersome, parsing could be the key to solving them with elegance and efficiency. So, give it a try and see how it can transform the way you approach your coding problems.

Explore Further

Well done! You have now seen how useful parsing can be as a general concept, way beyond JSON.

We hope this example has illustrated just how straightforward and powerful parsing can be, especially when using such a robust library. You might not have initially considered hacking your tools or exploring parsing as a solution to your problems. Perhaps you even thought that alternative approaches, such as Regular Expressions, might be simpler. But now you might see things differently!

If you’re interested in learning more, Point-Free has a great collection of videos about parsing.

If you have any questions or comments, feel free to ping Alejandro Martinez on Mastodon or X, or Swift Toolkit on Mastodon or X.

See you at the next post. Have a good one!
Swift, Xcode, the Swift Package Manager, iOS, macOS, watchOS and Mac are trademarks of Apple Inc., registered in the U.S. and other countries.

© Swift Toolkit, 2024-2025