meain/blog

Dec 11, 2022 . 11 min

What is in a modern code editor?

What do I know? #

Before we begin, let me give you an idea of where I am coming from and what my experiences are based on. Just to get it out of the way, I'm currently an Emacs user. That said, I have used a lot of text/code editors, all the way from Notepad to JetBrains IDEs (well, in the past) and even the Borland C++ editor, which looked like this running on DOS.

Screenshot of Turbo Borland editor

I still remember the day I learnt that you can type sop and hit ctrl + space in NetBeans to complete to System.out.println. This was back in my school days. I also vividly remember the Code::Blocks splash screen to this day. It was definitely a thing of beauty. Well, enough reminiscing about the past and the fact that I am getting old.

The code editors that I have spent the most time with, however, are Sublime Text, Neovim, Emacs and VS Code. That said, I have used quite a few other editors over the years, out of which the ones that I thought were interesting are zed (not the new one, I haven't got my hands on that), kakoune, helix and xi. I kinda also like watching bisqwit use That editor or Casey use 4coder.

In this post, the idea is to convey what I think a modern text editor should be capable of doing and what things are generally used to power them. Now, without further ado, let's get into the meat (or something else if you are vegan) of the blog.

Syntax highlighting #

While not everyone needs syntax highlighting, most people do. For the better part of text editor history, syntax highlighting was powered by a lot of hacky regexes. One significant improvement came with TextMate, which introduced TextMate Grammar, a more organized way of describing how to highlight code, though it still relied on the old hacky regexes. But these days we have something much better. We have Tree-sitter, which, unlike regexes, actually parses your code and forms a syntax tree that you can run queries against to figure out how to highlight it. Tree-sitter is useful for a lot more than just syntax highlighting, but it also makes syntax highlighting a lot more efficient. Here is a great talk by the original author introducing tree-sitter.

For those interested, I gave a talk at EmacsConf this year about Tree-sitter.

While this is an area which has mostly been figured out, we are still debating and coming up with interesting ideas around syntax highlighting like semantic highlighting which lets you highlight code based on semantics in addition to just syntax.

Building My Own Clojure Tools - Nikita Prokopov (tonsky) is a pretty good talk which discusses syntax highlighting along with other things like font and indentation. There is also an interesting talk by Damian Conway which gave me the idea of using backgrounds for showing errors. I also want to introduce you to some works by Nicolas P. Rougier around Emacs. Talk: On design of text editors by Nicolas P. Rougier and Paper: On design of text editors by Nicolas P. Rougier.

Here is what my Emacs looks like. I personally use a minimal theme highlighting just the function names in declarations, comments and strings. For those interested, you can find my Emacs theme hima in my dotfiles.

Screenshot of my Emacs session editing go code

Browsing and picking files #

File pickers come in various forms. You have your classic side pane, side pane with tree, fuzzy finding, inline tree. At the very least, most editors come with some way to browse files, for example dired in Emacs, netrw in vim or the file tree side panel in VS Code.

While file browsers are useful, I find selecting a file using some kind of string matching to be a much better option. Most popular editors also support this, with Sublime Text popularizing the idea IIRC. In rudimentary versions, this is done using plain string matching on all the files available in a project, and one level up from that is fuzzy matching.
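To make the difference between plain string matching and fuzzy matching concrete, here is a minimal sketch. It is not how any particular editor implements its picker; the function names, the file list and the "shortest path first" ranking are all made up for illustration. Real pickers use much smarter scoring.

```python
def substring_match(query: str, path: str) -> bool:
    """Rudimentary matching: the query must appear contiguously in the path."""
    return query.lower() in path.lower()


def fuzzy_match(query: str, path: str) -> bool:
    """Fuzzy matching: query characters must appear in order, gaps allowed."""
    remaining = iter(path.lower())
    # `ch in remaining` consumes the iterator up to the first hit, so each
    # query character is only searched for after the previous match.
    return all(ch in remaining for ch in query.lower())


def pick(query: str, paths: list[str]) -> list[str]:
    """Return fuzzy matches, shortest path first as a crude ranking."""
    return sorted((p for p in paths if fuzzy_match(query, p)), key=len)


# Hypothetical project files.
files = ["cmd/main.go", "pkg/utils/strings.go", "pkg/utils/strings_test.go"]
print(pick("pustr", files))
```

Here `pustr` is not a substring of any path, so plain string matching finds nothing, while the fuzzy version still picks up both files under `pkg/utils`.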

Again, Emacs does this much better using the orderless package, which provides one with insanely powerful filtering abilities. In the following video you can see that I first narrow it down to just go files using the regex .go$, then search for utils, then filter out things which have the word test in them using !test and finally look for patterns which have three words starting with p using ,ppp.

Another benefit of Emacs is that you can edit dired buffers as if they were just text, with the full power of macros or any other elisp functions, and the changes will be reflected in the filesystem.

Now to give some credit to the folks at Neovim, they have Telescope, which gives you a live preview of the buffer. Emacs lets you do that too, but yeah. I used to do this by integrating fzf, but the colors did not match the colorscheme of the editor.

Another neat tool in this area is broot, which lets you fuzzy find with the context of folder hierarchy as a tree.

Searching through the codebase #

Back in the good old days, we just had grep, then came ack, then ag and now everyone seems to have settled on rg. These are all cli tools which let you do grep-like things, but most good editors have some way to do these kinds of searches from within them and/or integrate these tools into the editor.

In most cases, when you are starting off with a new codebase or just wanna find where that one pesky little thing is coming from, grep is always your best friend. If I am not mistaken, VS Code also uses rg for project-wide search. So is the case for Emacs and Neovim, though in the latter two you can easily swap out the grep engine.

A great feature of Emacs is that you can turn the grep results buffer into an editable buffer and edit across all the results from multiple files.

Here is a screenshot of rg.el in Emacs.

Screenshot of rg.el in Emacs

Language intelligence #

These are things like intelligent autocompletion, go to definition, find references, rename, refactor etc. It is one of those features that used to be restricted to just big IDEs, but ever since LSP has been a thing, any editor with an lsp client can get a lot of the language intelligence. While I'm not a big fan of Microsoft, you still have to give them credit for introducing the world to the idea of LSP through VS Code.

While we used to have ctags or GNU Global, they only covered the basics of jumping around, were at times a bit clunky and did not update their lookup files in real time. But now with lsp, you get most of the important bells and whistles of IDEs with none (only a little) of that bloat.
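For a feel of what an lsp client and server actually exchange, here is a rough sketch of the wire format: each message is a JSON-RPC body preceded by a `Content-Length` header. The helper names are mine, and the file URI and cursor position in the request are made up. A real client would also send `initialize` first and talk to the server over stdio or a socket, which is left out here.

```python
import json


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with the Content-Length header LSP uses."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data: bytes) -> dict:
    """Parse one framed message back into a dict."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length].decode("utf-8"))


# A "go to definition" request; the URI and position are hypothetical.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/definition",
    "params": {
        "textDocument": {"uri": "file:///tmp/example/main.go"},
        "position": {"line": 10, "character": 4},  # zero-based
    },
}

framed = encode_message(request)
print(framed.split(b"\r\n\r\n")[0])
print(decode_message(framed) == request)
```

Since the protocol is just framed JSON like this, any editor that can spawn a process and read its output can speak it, which is a big part of why lsp spread so quickly.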

Beyond just navigating and refactoring source code, lsp servers have evolved into a way for someone to have an editor agnostic way to create servers which will let users do interesting things. One such example would be joe-re/sql-language-server which can connect to your db and show docs, generate completion candidates, run sql queries etc. Having an lsp for bibtex, LaTeX or coq is also pretty useful. You can use lsp to validate and provide completions and docs in your config files using json-language-server and yaml-language-server along with json schema. Having sourcegraph being pluggable as an lsp server was also pretty interesting. There is also efm-langserver which lets you create language servers from binaries like linters.

Here is a gif of lsp providing autocompletions in Sublime Text.

GIF of lsp providing autocompletions in Sublime Text

Kinda unrelated, but Raph Levien had some interesting ideas early on when working on Xi on how external tools should interact with an editor. You can see the recording here.

Debugging capabilities #

Talking more about the things that only IDEs (well, Emacs too) used to have, next is debugging. Debug Adapter Protocol (DAP) is another thing which you have to give Microsoft credit for. DAP brings the ability to debug code in your editor by connecting to a server started by a language-specific tool.

This is probably something that I have taken the least advantage of out of all the things that Emacs can do. I still do most of my debugging outside the editor in a terminal. I started my debugging journey with gdb, then used a lot of ipdb or pudb and nowadays a lot of dlv. I am trying to fix this problem of mine where I don't use the editor for debugging by building an Emacs plugin to do debugging using dap (there is already dap-mode) called dap-dance, which is still unreleased and might stay that way forever.

I really found this interaction between Lex Fridman and John Carmack around debuggers pretty interesting.

Below is a screenshot of nvim-dap-ui. You can see that it has most of the kind of things that you need from a debugger. Your watch window, backtrace, locals; it's all there.

Screenshot of nvim-dap-ui

Ability to run linters and show errors #

Linters are one of those things which are close to my heart. Ever since I realized I can use linters to look into my code to find dumb shit I do, I have been a fan of them.

I am from that era where you had to save a file, then wait for vim to run linters on your code before you could do anything. Kids these days have it so easy, they have linters continuously running while they edit. I remember being so excited when ale came out for vim that I no longer had to wait after every save to get my lint errors popping up. Of course, Emacs already had similar things at that point, but I was a Vim person back then.

Most editors these days have the ability to run linters on code in the current active buffer even before you save it. It can take the unsaved source as it is in the editor, pipe it to the linter and get the results back and render them in the text editor as well as provide ways to navigate between different errors. Most of them end up drawing those squiggly red underlines where the error occurred and you can go to them and see what the error is.
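The flow described above can be sketched roughly like this. It assumes a linter that reads source on stdin and prints one `file:line:col: message` per line, which many do but not all; the `Diagnostic` type and the function names are hypothetical, and an actual editor plugin would run this asynchronously rather than blocking.

```python
import re
import subprocess
from dataclasses import dataclass

# Assumed output format: "<file>:<line>:<col>: <message>", one per line.
# Paths containing ":" (e.g. Windows drive letters) are not handled.
DIAGNOSTIC_RE = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+):\s*(?P<msg>.+)$"
)


@dataclass
class Diagnostic:
    file: str
    line: int
    col: int
    message: str


def parse_diagnostics(output: str) -> list[Diagnostic]:
    """Turn linter output into diagnostics, skipping lines that do not match."""
    found = []
    for raw in output.splitlines():
        m = DIAGNOSTIC_RE.match(raw.strip())
        if m:
            found.append(
                Diagnostic(m["file"], int(m["line"]), int(m["col"]), m["msg"])
            )
    return found


def lint_buffer(command: list[str], buffer_text: str) -> list[Diagnostic]:
    """Pipe the unsaved buffer to the linter's stdin and parse its stdout."""
    result = subprocess.run(
        command, input=buffer_text, capture_output=True, text=True
    )
    return parse_diagnostics(result.stdout)


def next_diagnostic(diags: list[Diagnostic], current_line: int):
    """First diagnostic after the cursor line, wrapping around to the top."""
    ordered = sorted(diags, key=lambda d: (d.line, d.col))
    for d in ordered:
        if d.line > current_line:
            return d
    return ordered[0] if ordered else None


# Made-up linter output, just to exercise the parser.
sample = "main.go:12:5: unused variable x\nsome banner text\nmain.go:3:1: missing doc comment\n"
diags = parse_diagnostics(sample)
print(next_diagnostic(diags, 3))
```

The parsed line and column are what the editor uses both to draw the squiggly underline and to jump between errors.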

Here is a screenshot of Flycheck in Emacs showing lint messages:

Image of Flycheck in Emacs showing off its features

It is not just code: while writing prose, you can have your editor integrate spelling and grammar checking tools as linters and have them vet your writing.

Formatting code #

Next one on my favorites list after linters is auto formatters. The only reason this is second is because I might be able to make a rudimentary formatter on my own, but need someone else to make a linter. Having an auto formatter is a godsend in most cases.

You write the code, they take care of making it look pretty or more importantly consistent with everyone else's code. I believe Go was the first language to go all in on auto formatting, but most other languages followed suit. No more arguing and worrying about where to put that { or if we should put a space before +. Right after gofmt became popular, we had prettier for javascript, black for python, rustfmt for rust and a lot of others.

We even have formatters for sql and prettier has a formatter for markdown as well which can nicely format markdown tables. Being able to format generated html, css or json with just one keystroke is really useful.

Here is a really interesting discussion about go fmt with the core go team. I don't know if it is in this video or somewhere else, but I remember a statement from Rob Pike which goes along the lines of "Nobody likes the way gofmt formats their code, but everyone likes that it formats their code". I personally found it pretty relatable. You might not like the way the formatter formats the code, but there are a lot of advantages that you get from everyone's code looking the same. It is a lot easier to parse things when you are reading code, plus you start to get this gut feeling that something is wrong just from the look of it.

These days, most languages have autoformatters and any good editor should have the ability to use them to format the user's code. Most editors also have a setting which you can enable to automatically format your code before you save it, which could come in handy if everyone in your team has already agreed on using a specific formatter.
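A format-on-save hook can be sketched in a few lines, assuming the formatter reads source on stdin and writes the formatted result to stdout (gofmt behaves this way when given no file arguments). The function names are hypothetical, and the tiny Python one-liner below only stands in for a real formatter so the example runs anywhere.

```python
import subprocess
import sys
from typing import Optional


def format_buffer(text: str, command: list[str]) -> str:
    """Pipe text through a formatter; keep the original text if it fails."""
    try:
        result = subprocess.run(
            command, input=text, capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return text  # formatter missing or hung: never lose the user's code
    if result.returncode != 0:
        return text  # e.g. a syntax error the formatter cannot handle
    return result.stdout


def save(path: str, text: str, formatter: Optional[list[str]] = None) -> str:
    """Write the buffer to disk, formatting it first if a formatter is set."""
    if formatter is not None:
        text = format_buffer(text, formatter)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
    return text


# Stand-in "formatter" that just trims surrounding whitespace.
demo_formatter = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().strip() + chr(10))",
]
print(repr(format_buffer("   x = 1   \n\n", demo_formatter)))
```

Falling back to the unformatted text on any failure is a deliberate choice here: a broken or missing formatter should never block saving or eat the buffer.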

Terminal emulator #

While you can hack around and connect the terminal emulator of your choice to talk to your editor, it is better to have some way of emulating a terminal in your editor. Emacs is so good at it that it has 6 different options: eshell, term, shell, ansi-term, vterm and eat.

You will always run into cases where you would like to run commands while working on code. You can, most of the time, get away with external terminals, but having it as a component of your text editor makes it much easier to integrate your code and terminal. For example, navigating to an error in compilation output using the line and column number mentioned is much easier if the terminal emulator is part of your editor. When I used vim/neovim, I was mostly getting away with running it in tmux and writing integrations between tmux and vim to get similar things done. These days, neovim and vim have built-in terminal emulators as well.

Here is what the integrated terminal looks like in Codium.

That said, you don't always need a terminal emulator. Kakoune can get away without one, as it expects you to be running within a tiling window manager or tmux and to have some other way of using the shell.

Git/VCS integration #

Yet another thing which you can get away without integrating directly into the editor, but which is really helpful if you do have it. I am not just talking about the ability to view diffs and commit code, but things like indicators to show that you edited something, help with resolving merge conflicts, viewing who changed a line and how, or even interacting with forges (RIP Atom). For those who have not heard about it, Magit in Emacs is just freaking bonkers. You can see a video about it here.

Here is a screenshot of Magit in Emacs.

Configuration language and plugin system #

This is probably one of the most important or one of the least important things, depending on who you ask. For me personally, I don't think I'll be using a text editor that cannot be configured to my liking. I am not talking about having a config file where you can edit keybindings, but rather a system which will let me tweak the behavior of the editor by writing code (ideally while it is running). This lets you make the editor your own and make it work the way you want instead of adapting to how it wants you to work.

In this regard, Emacs is probably the best you can get as of now IMO. Emacs is built mostly using elisp (except for a C base to build elisp and some core perf critical pieces). This gives you a lot of configurability, and this knowledge can apply to working on core emacs as well. If you didn't know, even a simple action like inserting a character in a buffer is an elisp function which you can edit just like anything else.

Now for a few more entries #

Here are some less newsworthy but still important ones.

And even more things #

Now we are getting into my personal opinions (as if the rest was just objective truth).

And I'll stop here. I started this blog thinking I would talk about just lsp, dap and tree-sitter, but halfway through pivoted to writing about all the things that I think should be in a modern code editor. Also, if at any point it sounded like Emacs solves all your problems, it is because it does. Just kidding; use whatever you are comfortable with. Everything has its problems, but don't be afraid to try other things.

There are some interesting discussions over on HackerNews and Lobsters in case you want to check them out.
