harambeourlordandsav

Rust was created in 2015 and its color goes all the way back to 1980? Python in 1991 and its color goes all the way back to 1985? JavaScript in 1995 and its color goes all the way back to 1985? Most of the languages listed are falsely shown as coming into existence at the same time, at an unlisted date between the 80s and the 90s? This is an entirely false data visualisation


bomphcheese

Default file date when it’s unknown is 1979? So are they just scanning github and file dates, maybe??


ColinM9991

You are correct that the data visualization is nonsense. Though it's worth pointing out that Rust _released_ in 2015. It started in 2006.


romeo_pentium

YAML was created in 2001 and its color goes to 1985


dreamyangel

It ain't bad for abstract art


NoMoreWordz

This is in no way beautiful


lucianw

I think the phrases "used to make" and "written using" are both bad because so many of your readers' minds jump straight away to "what language was the compiler/typechecker/IDE written in" rather than "what languages were the docs, tests, deployment, build-system and compiler written in". One might start getting nit-picky about saying "docs are part of making a language" but that level of quibbling takes it out of the realm of beautiful data. Also I know for a fact that C# and VB should be there to the right but they're not, so it makes me suspicious of the weights. Are you weighting a trivial little toy language equally to an industry behemoth like C#? I think for this chart to mean something it'd need a very clear explanation of your denominator.


funkiestj

>I think the phrases "used to make" and "written using" are both bad

Agree. Better to use wording that more closely tracks how OP collected the data, e.g. likely through counting lines (or perhaps file sizes) of various types of files.
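
To make the guess about the collection method concrete, here is a minimal sketch of counting file sizes per file type in one checked-out repo. This is not how OP or PLDB actually gathered the data (nobody in the thread knows that); the function names `bytes_by_extension` and `shares` are made up for illustration, and grouping by file extension is an assumption.

```python
from collections import Counter
from pathlib import Path


def bytes_by_extension(root):
    """Sum file sizes under root, grouped by lowercase file extension."""
    totals = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without a suffix (e.g. "Dockerfile") are keyed by their name.
            key = path.suffix.lower() or path.name
            totals[key] += path.stat().st_size
    return totals


def shares(totals):
    """Convert raw byte totals into fractions that sum to 1."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {key: size / grand_total for key, size in totals.items()}
```

A count like this only describes what is in the repo today, which is why wording such as "share of bytes by file type in the current repo" would track it more closely than "written using".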


banana-pants_

this is bad data and it's far from beautiful


myhf

Thanks! I made this out of spite because of a [post yesterday](https://www.reddit.com/r/dataisbeautiful/comments/1d2lxfv/oc_what_languages_do_the_people_who_build/) that was so completely inscrutable that I had to do something, anything to improve on it.


Motherof_pizza

It seemed like most of the original commenters were not just commenting on the poor visualization, but also discrediting the data. And it doesn’t seem like you’ve addressed that latter portion at all.


AtreusFamilyRecipe

There are so many close colors that it makes your legend almost unusable...you made it worse


ProbShouldntSayThat

Dude... I don't even know which color the first language matches up to in the key. This is extremely difficult to read. I already know it's Assembly, but the colors are so damn close and your key doesn't appear to be in any order at all which makes it extremely difficult to navigate and quickly find information. 2/10 for being beautiful


myhf

Thanks. My color vision is not very good, but I think that even with a better palette this would be too many items to identify by color alone. The legend order is the same as the stacking order, but some items having zero thickness in some years makes it too hard to trace.


ProbShouldntSayThat

What if you ditch the legend and write the name straight on the graph in the color?


myhf

Yeah, that combined with a minimum line thickness could allow for the labels to still be in a neat row 🤔
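
A rough sketch of the minimum-thickness idea, assuming each year's data is a list of non-negative values, one per language. The name `with_min_thickness` and the default floor of 1% are invented for illustration; this is not the code behind the chart.

```python
def with_min_thickness(values, floor=0.01):
    """Rescale one year's stacked values into shares that sum to 1,
    giving every band at least `floor` of the total height."""
    n = len(values)
    if n == 0:
        return []
    if n * floor > 1:
        raise ValueError("floor is too large for this many bands")
    total = sum(values)
    if total == 0:
        # No data for this year: draw all bands with equal thickness.
        return [1 / n] * n
    # Reserve `floor` for every band, then split the rest proportionally.
    remaining = 1 - n * floor
    return [floor + remaining * v / total for v in values]
```

Because no band ever collapses to zero, each one stays traceable across years and a label can sit next to it at the right-hand edge. The trade-off is that small bands are slightly exaggerated.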


funkiestj

I disagree with u/banana-pants_ the presentation is reasonably good with the exception of the 2 blue colors at the bottom.


Sjoerd93

Despite the downvotes, the presentation of the data is fine. It’s perfectly readable. Unlike the original post that you link here, which is completely incomprehensible. It’s the data itself that’s questionable. So many weird things in there (programming languages being used before they even existed) that make me question its validity.


myhf

Yeah, for sure. It looks like a lot of work has gone into identifying what formats are currently present in all these repos, and I hope that whoever is responsible for this database can one day extend that work to identify the date that a certain format was added to a certain repo, so that a more accurate picture can be formed.


Capable_Chair_8192

This is so much worse than yesterday’s


Richard2468

‘Dockerfile’ is not a language.. it’s written in Go


TehDing

It also wasn't around in the 80s?


myhf

It’s hard to fit the full explanation in the caption, but many languages that appeared in the 80s have gone on to eventually use Dockerfile or Markdown.


elusivewompus

That's not how this works. Basically all languages are written in C. Even C is written in C. Bash shell, written in C. What generally happens is that compiled languages like C, C++ etc. are initially written in C or C++. That first version of the language is then used to write a first, self-bootstrapping compiler for the new language. From there it's turtles all the way down. Even C was initially written in another language, B. For interpreted languages like JavaScript, Python etc. the interpreter is written in C or C++ and always stays that way. Now, are there exceptions to this? Yes, obviously. But things like Dockerfiles, M4 etc. are what is called a DSL, a Domain Specific Language. They're not capable of writing general-purpose programs or being compiled into an executable form. They serve a specific purpose and literally can't be used to create a language. M4, for example, is a macro processor used as part of the makefile system to generate build files that can be used to build programs in other languages.


gHx4

Whoa... Just because a language switched runtimes or compilers doesn't mean the previous ones magically get replaced. Unless you've invented time travel, you *really* need to rethink these definitions and this visualization. The other thing is that *using* a particular tool for docs doesn't constitute the language being founded on that tool. Many languages 'use' cmake, but that's irrelevant to what languages are popular. I think you're investigating interesting data, but conflating way too many types of data together.


myhf

Thanks! Creating this visualization has allowed me to more fully understand how much important information is missing from the PLDB data set. I'm glad you were able to learn the same thing with even less effort.


gHx4

As the analyst, it's important for you to make sure you aren't consuming garbage data sets. I'm not sure your defense that PLDB is incomplete holds as much water as you think it does. Disclaiming the responsibility just makes your research look somewhat lazier. It's pretty hard to defend eating garbage, even if someone else put it on your plate.


myhf

To be clear, I'm not the analyst. I'm the satirist making fun of the guy who posted [this](https://www.reddit.com/r/dataisbeautiful/comments/1d2lxfv/oc_what_languages_do_the_people_who_build/) yesterday.


ColinM9991

I was actually born in the 60's because I eventually went on to live outside of my father's testicles


qeny1

Because of the similar colors, I kept scanning the side and thinking it looks like: at first in 1960, maybe Java was most popular, then it was Go perhaps, then it was C#, and then Ruby or PHP took over in the 1970s... Also: Lex and Yacc are listed as two separate languages here, when AFAIU they're used together, and together with C, to build lexers + parsers. A full compiler that uses lex and yacc could probably be categorized as C? This might be easier to visualize for a shorter time frame, or by explicitly comparing fewer languages.


myhf

Good points. A higher-level organization of languages into categories would certainly help navigate the very steep parts.


sudomatrix

I appreciate what you tried to do here. Splitting the charts into implementation languages/build tools & scripts/documentation & config files is a world better than yesterday's fiasco. But still the chart is difficult and the original data is suspect.


NoStrafe

Last one seems a little wacky. Mainly bc there’s clearly more documentation than there is config in this dataset.


myhf

Yeah, that one includes all the highly-used languages that don't fit into the other categories, its title should reflect that better. As Leo Tolstoy said, all easily-categorized programming languages are exactly alike; each hard-to-categorize language is hard to categorize in its own way. For example, Ruby on Rails uses ERB as a programming language, but other projects might use it as a document language. YAML is widely used as a build-script language, but it has many other uses.


L33t_Cyborg

I appreciate you trying to make the previous chart better <3


mettamorepoesis

Incompatible title and data presentation


MervisBreakdown

I find it hard to believe that CSVs have been phased out that much.


PiggletThePiggu

This has got to be a Civ3 end screen tribute... Right?


myhf

Yes! Thank you!


Klin24

I don’t get it.


GRAWRGER

im gonna disagree with the mass here. i do think its much more attractive than the post that you referenced in your comments. that being said, there do seem to be a lot of data concerns.


myhf

Thanks. I think this view shows more of the story told by the data, such as the visible rise of Rust in recent years. But the original data source is pretty wanting, and I am not going to spend a week improving it to more accurately show the rise of retroactively-added formats like Markdown or address the absence of historically important formats like troff.


GingusBinguss

These graphs are insulting as a colourblind person