ArkType is a really interesting library that has a difficult time marketing itself. More than being a schema validator, it brings TS types into the runtime, so you can programmatically work with types as data with (near?) full fidelity.
I've been evaluating schema libraries for a better-than-Zod source of truth, and ArkType is where I've been focused. Zod v4 just entered beta[1], and it fixes many of my problems with it. For such a mature library to improve like this, v4 is a treat and speaks volumes about the quality of the engineering. But ArkType has a much larger scope, and feels to me more like a data modeling language than a library. Something I definitely want as a dev!
The main downside I see is that its runtime code size footprint is much larger than Zod's. For some frontends this may be acceptable, but it's a real cost that isn't wise to pay in many cases. The good news is that with precompilation[2] I think ArkType will come into its own, look more like a language with a compiler, and be suitable for lightweight frontends too.
[1] https://v4.zod.dev/v4
[2] https://github.com/arktypeio/arktype/issues/810
I recently went down this same rabbit hole for the backend and stumbled on Typia[0] and Nestia[1] from the same developer. The DX is fantastic, especially when combined with Kysely[2], because now it's pure TypeScript end-to-end (no runtime schema artifacts, and validations get AOT-inlined).
I was so shocked by how good this is that I ended up writing up a small deck (haven't had time to write this into a doc yet): https://docs.google.com/presentation/d/1fToIKvR7dyvQS1AAtp4Y...
Shockingly good (for backend)
[0] Typia: https://typia.io/
[1] Nestia: https://nestia.io/
[2] https://kysely.dev/
An interesting development with Typia is that it will need to be rewritten in Go to work with TypeScript 7. https://github.com/samchon/typia/issues/1534#issuecomment-27...
This is because it relies on patching the TypeScript implementation. I'm curious whether its approach is even feasible with Go.
Between this and Node adding the --experimental-strip-types flag (which would otherwise let people skip compilation entirely), I'm not sure I would choose Typia right now. I'm sure it's a great library, but these developments don't bode well for its future.
I think it's fair to be skeptical, but I'm aligned with the overall approach the author took and I think the approach itself is what is interesting (pure TS + AOT).
Author + contributors and ts-patch team[0] seem up for a rewrite in Go based on that thread! Might be bumpy, but a pure TS approach is really appealing. I'm rooting for them :)
[0] https://github.com/nonara/ts-patch/issues/181#issuecomment-2...
I was going to ask about how pure types would fill the gap for other validations in Zod like number min/max ranges, but seeing the tags feature use intersection types for that is really neat. I tried assigning a `string & tags.MinLength<4>` to a `string & tags.MinLength<2>` and it's interesting that it threw an error saying they were incompatible.
That's because "minimum length" cannot be enforced in TypeScript. Maybe you already know this.
I'm not a Typia user myself, but my RPC framework has the same feature, and the MinLength issue you mentioned doesn't crop up if you only use the type tags at the client-server boundary, which is enough in my experience.
Thanks for sharing the deck! I had no idea Typia existed and it looks absolutely amazing. I guess I'll be trying it out this weekend or next :)
The docs have a bit of a rough edge because the author is Korean, but the examples are quite good and took me maybe 2-3 hours to work through.
Once everything clicked (quite shortly in), I was a bit blown away by everything "just working" as pure TypeScript; I can only describe the DX as "smooth" compared to Zod because now it's TypeScript.
Definitely check out Valibot as well, it may be the smaller footprint zod you’re looking for: https://valibot.dev
There's also zod mini now too https://v4.zod.dev/packages/mini
> it brings TS types into the runtime
So...it's a parser. Like Zod or effect schema.
https://effect.website/docs/schema/introduction/
No, it's more like a type reflection system, at least as I understand it. You can use it to parse types, but you can also do a lot more than that.
Could you give an example or two of “more than that”?
Yeah, you can walk the AST of your types at runtime and do arbitrary things with it. For example, we're using ArkType types as our single source of truth for our data and deriving database schemas from them.
This becomes very nice because ArkType's data model is close to an enriched version of TypeScript's own data model. So it's like having your TypeScript types introspectable and transformable at runtime.
You can do whatever you want with the AST in effect schema too, it's a parser with a decoder/encoder architecture:
https://effect.website/docs/schema/transformations/
TypeBox is similar, by virtue of its goal of having its runtime types match JSON Schema's data model without needing conversion.
That's neat, thanks!
> The main downside I see is that its runtime code size footprint is much larger than Zod.
Yes, it unfortunately really does bloat your bundle a lot, which is a big reason I personally chose to go with Valibot instead (it also helps that it's a lot closer to Zod's API, so it's easier to pick up).
Thanks for linking that issue, I'll definitely revisit it if they can get the size down.
Personally, I find Zod’s API extremely intimidating. Anything more resembling TypeScript is way better. ArkType is neat, but ideally we’d have something like:

export reflect type User = {
  id: number;
  username: string;
  // ...
};

This is why I like libraries like typia or typebox-codegen; I'd prefer to write TypeScript and generate the validation, rather than write a DSL.

Edit: just remembered about this one: https://github.com/GoogleFeud/ts-runtime-checks
It's not perfect and doesn't cover all of Zod's functionality (IIRC coercion), but I've used https://www.npmjs.com/package/ts-to-zod before to generate Zod schemas directly from types.
ArkType looks anything but ergonomic to me. Typescript in strings? Double level type encoding?
It’s a miracle it can be 100x faster than Zod, but speed was never my issue with zod to begin with.
I've not tried it yet so am reserving judgement, but it doesn't look that bad at all.
That said, it seems overall more scannable than an equivalent Zod schema, given the similarity to 'raw' TS.
Also it seems like a fairly short hop to this engine being used with actual raw TS types in a compilation step or prisma-style codegen?
I have to agree, I know not all people like the same things, but this looks terrible to me. Maybe it’s like Tailwind in the sense that you learn to like it when you actually use it.
The thing is Zod seems fairly standard in the ecosystem, and I value that more than novelty.
TS: "We have added types to javascript, everything is now strongly typed, and the compiler will pester you to no end until it's happy with the types."
Me: "Awesome, so I get an object from an API, it will be trivial to check at runtime if it's of a given type. Or to have a debug mode that checks each function's inputs to match the declared types. Otherwise the types would be just an empty charade. Right?"
TS: "What?"
Me: "What?"
Realizing this was a true facepalm moment for me. No one ever thought of adding a debug TS mode where it would turn
function myFunc(a: string, b: number) {}
into
function myFunc(a: string, b: number) {
  assert(typeof a === "string")
  assert(typeof b === "number")
}
to catch all the "somebody fetched a wrong type from the backend" or "someone did some stupid ((a as any) as B) once to silence the compiler and now you can't trust any type assertion about your codebase" problems. Nada.
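Nothing stops you from hand-rolling a crude version of that "debug mode" today. A minimal sketch, where `checked` and its tag list are hypothetical names of mine, not any library's API:

```typescript
// Minimal hand-rolled "debug mode" wrapper: checks each argument against
// a typeof tag at runtime before calling through to the real function.
type TypeofTag = "string" | "number" | "boolean" | "object" | "function";

function checked<A extends unknown[], R>(
  tags: TypeofTag[],
  fn: (...args: A) => R
): (...args: A) => R {
  return (...args: A): R => {
    tags.forEach((tag, i) => {
      if (typeof args[i] !== tag) {
        throw new TypeError(`arg ${i}: expected ${tag}, got ${typeof args[i]}`);
      }
    });
    return fn(...args);
  };
}

// Usage: the declared types and the runtime tags must be kept in sync by
// hand, which is exactly the duplication TS never automated away.
const myFunc = checked(["string", "number"], (a: string, b: number) => `${a}:${b}`);
```

Calling `myFunc("x", "y" as unknown as number)` now throws at the boundary, which is the `((a as any) as B)` problem being caught at runtime instead of silently propagating.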
V4 of zod landed recently and it promises better perf https://v4.zod.dev/v4
I really want to see the people that have performance issues with Zod and what's their use case.
I mean it.
I've been parsing (not just validating) runtime values for a decade (io-ts, Zod, effect/schema, tcomb, etc.), and I find the performance penalty irrelevant in virtually any project, either FE or BE.
Seriously, people will fill their website with Google tracking crap, 20000 libraries, react crap for a simple crud, and then complain about ms differences in parsing?
We use it heavily for backend code, and it is a bit of a hot path for our use cases. However, the biggest issue is how big the types are by default. I had a 500-line schema file that compiled into an 800,000-line .d.ts file, occupying a huge proportion of our overall typechecking time.
That sounds absolutely absurd.
Are you using a lot of deeply nested objects + unions/intersections?
A fair number of unions, yeah. Which also means some of the tricks for keeping the types small don't work, i.e. taking advantage of interface reuse.
Yup. I maintained an e-commerce site where the products came from a third-party API; the products often had 200+ properties, and we often needed certain combinations of them to be present to display them. We created schemas for all of them, had to transform the data quite a bit, and used union types extensively. When displaying a product list with hundreds of these products, Zod would take some time (400+ ms) to parse it all; Valibot took about 50 ms. Editor performance was also noticeably worse with Zod, taking up to three seconds for code-completion suggestions to pop up or type inference to complete, though truth be told Valibot was not significantly better here at the time.
I agree though, that filling your website with tracking crap is a stupid idea as well.
Zod is the default validator for https://github.com/gajus/slonik.
Zod alone accounts for a significant portion of the CPU time.
> In the context of the network overhead, validation accounts for a tiny amount of the total execution time.
> Just to give an idea, in our sample of data, it takes sub 0.1ms to validate 1 row, ~3ms to validate 1,000 and ~25ms to validate 100,000 rows.
I’ve used it on the backend to validate and clean up tens of thousands of documents from Elasticsearch queries, and the time spent in Zod was very much noticeable
My issue is TS server performance, not so much runtime.
> V4 of zod landed recently and it promises better perf
Still far behind if the 100x is to be believed. v4 isn't even a 10x improvement. Nice changes though.
Really conflicted with TS. On one hand it’s so impressive that a type system can do these sort of tricks. On the other hand if we had type introspection at runtime we wouldn’t need any of this.
In validation, it's never about speed. It's how you relate the schema tree to the error-reporting tree. If you haven't already, you will figure that out eventually.
If you mess that up (by being too flat, too customizable, or too limited), library users will start coming up with their own wrappers around it, which will make your stuff slower and your role as a maintainer hell.
(source: 15 years intermittently maintaining a similar project).
There is an obvious need for a validation library nowadays that bridges oop, functional and natural languages. Its value, if adopted as a standard, would be immense. The floor is still lava though, can't make it work in today's software culture.
Especially when unions are involved. We have a flexible schema for many parameters (e.g., X accepts an object or an array of them, and one object has some properties that could be an enum or a detailed shape), and both Zod and Valibot produce incomprehensible and useless error messages that don’t explain what’s wrong. We had to roll our own.
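One way to roll your own with better union errors is to try every branch and report each branch's failure, rather than emitting a single merged message. A rough hand-rolled sketch of the idea (not Zod's or Valibot's internals; all names are mine):

```typescript
// A check returns null on success or an error message on failure.
type Check = (x: unknown) => string | null;

const isString: Check = (x) =>
  typeof x === "string" ? null : "expected string";
const isNumberArray: Check = (x) =>
  Array.isArray(x) && x.every((v) => typeof v === "number")
    ? null
    : "expected number[]";

// For a union, run every branch and keep each branch's error, so the
// combined message explains why each alternative failed instead of an
// opaque "invalid union".
function union(...branches: [string, Check][]): Check {
  return (x) => {
    const errors: string[] = [];
    for (const [label, check] of branches) {
      const err = check(x);
      if (err === null) return null; // some branch matched
      errors.push(`${label}: ${err}`);
    }
    return `no union branch matched (${errors.join("; ")})`;
  };
}

// e.g. "X accepts a string or an array of numbers"
const param = union(["string form", isString], ["array form", isNumberArray]);
```

With this shape, `param(true)` yields a message listing both "string form: expected string" and "array form: expected number[]", which is the per-branch detail the stock libraries tend to flatten away.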
I find your last paragraph to be lacking. Would help if you elaborated more on the purported need for this "bridge" concept.
Since recent TypeScript features have made it more possible, I'm less interested in runtime validation, and really only keen on build-time schema validation.
There are a few tools out there that generate code that TypeScript will prove validates your schema. That, I think, is the path forward.
Runtime validation is strictly necessary across data boundaries that consume user input.
Sorry, I was unclear.
Using a library like zod requires you to trust that Zod will correctly validate the type. Instead, I much prefer to have schema validation code that TypeScript proves will work correctly. I want the build-time checks that my runtime validation is correct.
Typia generates runtime code that typescript can check correctly validates a given schema https://typia.io/docs/validators/assert/ . I've never actually used it, but this is closer to the realm I prefer.
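For a sense of what "generated code that TypeScript can check" looks like, here is a hand-written sketch of the shape of such an assert function. This is illustrative only, not Typia's actual output:

```typescript
interface User {
  id: number;
  username: string;
}

// Hand-written stand-in for what an AOT tool might emit: plain
// typeof/property checks, no schema objects at runtime. Because the
// signature uses the `asserts` form, tsc itself verifies that callers
// end up with a narrowed User after the call.
function assertUser(input: unknown): asserts input is User {
  if (typeof input !== "object" || input === null)
    throw new TypeError("expected object");
  const o = input as Record<string, unknown>;
  if (typeof o.id !== "number") throw new TypeError("id: expected number");
  if (typeof o.username !== "string")
    throw new TypeError("username: expected string");
}
```

Note the caveat: tsc checks the narrowing, but not that the checks are exhaustive; a generator like Typia derives them from the type so they are correct by construction.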
> Using a library like zod requires you to trust that Zod will correctly validate the type.
Not sure I understand this -- are you assuming there’s an existing schema, either a TS type or maybe something else like JSON Schema, and you’re trying to ensure a separate Zod schema parses it correctly?
The usual way to use Zod (or Valibot, etc) is to have the Zod schema be the single source of truth; your TS types are derived from it via z.infer<typeof schema>. That way, there’s no need to take anything on trust, it just works.
The thing that makes libraries like Zod trustworthy is that they:
- parse as their validation mechanism
- compose small units (close to the type system’s own semantics for their runtime equivalents, trivially verifiable), with general composition semantics (also close to the type system), into larger structures which are (almost) tautologically correct.
Obviously there’s always room for some error, but the approach is about as close to “safe” as you can get without a prover. The most common source of error isn’t mistakes in the underlying implementation but mistakes in communicating/comprehending nuances of the semantics (especially where concepts like optionality and strictness meet concepts like unions and intersections, which are also footguns in the type system).
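The "small, trivially verifiable units composed into larger structures" point can be made concrete with a hand-rolled sketch. This is the idea, not Zod's implementation; all names are mine:

```typescript
// Each unit is small enough to verify by inspection; composition
// mirrors the type system (object types built from property types).
type Parser<T> = (input: unknown) => T;

const str: Parser<string> = (x) => {
  if (typeof x !== "string") throw new TypeError("expected string");
  return x;
};
const num: Parser<number> = (x) => {
  if (typeof x !== "number") throw new TypeError("expected number");
  return x;
};

// General composition: an object parser built from field parsers.
// If each field parser is correct, the composite is (almost)
// tautologically correct.
function obj<T>(fields: { [K in keyof T]: Parser<T[K]> }): Parser<T> {
  return (x) => {
    if (typeof x !== "object" || x === null)
      throw new TypeError("expected object");
    const out = {} as T;
    for (const k in fields) {
      out[k] = fields[k]((x as Record<string, unknown>)[k]);
    }
    return out;
  };
}

const user = obj({ id: num, name: str });
```

`user` parses (rather than just validates): it returns a typed value built from the input, which is why the output type can be trusted.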
Which typescript features are improving runtime validation?
Previously (~2-3 years ago), it was impossible to narrow `unknown` to a fully typed object. Recently-ish, they added the ability for `"foo" in obj` to type-refine `object` to `object & {"foo": unknown}`, which lets you further narrow foo down to something more specific.
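A small example of that narrowing in action, written against current TypeScript:

```typescript
// The `in` operator refines `object` step by step, so `unknown` can be
// narrowed to a usable shape with no schema library at all.
function getName(value: unknown): string | undefined {
  if (typeof value === "object" && value !== null && "name" in value) {
    // here, value is narrowed to object & { name: unknown }
    if (typeof value.name === "string") {
      return value.name; // further narrowed to string
    }
  }
  return undefined;
}
```

Before the `in` refinement landed, the inner property access would not have typechecked without a cast.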
I don’t know too much about the TS ecosystem, but do these new systems you talk about do it via a “Smart Constructor”? That is the pattern I typically use to solve this problem in Haskell and Rust code.
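Essentially yes, and the pattern ports to TypeScript with a branded type. A sketch, with names of my own choosing:

```typescript
// Smart-constructor pattern: the branded type can only be produced via
// the constructor, so holding a NonEmptyString is proof the check ran.
// The brand field exists only at compile time; at runtime it's a string.
type NonEmptyString = string & { readonly __brand: "NonEmptyString" };

function nonEmptyString(s: string): NonEmptyString | undefined {
  return s.length > 0 ? (s as NonEmptyString) : undefined;
}

function shout(s: NonEmptyString): string {
  // callers can't pass a bare string here without going through the
  // constructor (or an explicit, greppable cast)
  return s.toUpperCase() + "!";
}
```

This is the same move as a Haskell newtype with a non-exported constructor; TypeScript fakes the nominal typing with the phantom `__brand` field.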
Any good article about these features?
This looks interesting, however Zod has become a standard of sorts and a lot of libraries I use expect, for example, a JSON schema defined as a Zod schema. I would need some sort of adapter to a Zod schema to make this work for me.
I like the idea of having types at runtime for parsing etc and generating validators in various languages. What stopped me from going there so far is that I already have TypeScript types provided for the various libraries I use. How good are the tools for importing TypeScript types into ArkType/Zod and working with types in various representations in parallel?
The way Zod and ArkType generally handle this is by treating the schema as the source of truth, rather than a type. They then provide a way to define the type in terms of the schema:

// zod 3 syntax
import { z } from 'zod'

const RGB = z.object({
  red: z.number(),
  green: z.number(),
  blue: z.number(),
})

type RGB = z.infer<typeof RGB>
// same thing as:
// type RGB = { red: number; green: number; blue: number };

For the initial migration, there are tools that can automatically convert types into the equivalent schema. A quick search turned up https://transform.tools/typescript-to-zod, but I've seen others too.

For what it's worth, I have come to prefer this deriving-types-from-parsers approach to the other way around.
With Zod you can build a schema that would match an existing type. TypeScript will complain if the schema you build does not match the type you are representing, which is helpful. From memory:

import { z } from 'zod'

type Message = { body: string; }

const messageSchema: z.ZodType<Message> = z.object({ body: z.string() })
With many TS features making their way into JS, I've sometimes wondered if TS is to JS what Sass is to CSS. I currently rely on TS, but I now consider Sass harmful (there being overlapping syntax with vanilla CSS).
What features are you thinking of? I hadn’t heard anything about TS types making their way into JS in any way (unless you count engines that can execute TS directly, but those just ignore the type syntax). That would be a massive change.
I'd be very interested in how to replicate parsing like this. Slow typescript inference has stalled my own exploration into things like this.
Is this at all related to Huawei's ArkTS?
https://developer.huawei.com/consumer/en/doc/harmonyos-guide...
No.