In order to understand where EDA tools are GOING … in the future, not where they are now, which is readily discoverable from Synopsys, Siemens EDA (formerly Mentor Graphics) and Cadence Design Systems, from the general Wikipedia comparison of free and open source EDA software, or from the DARPA-funded OpenROAD project for open-source semiconductor design automation tools … it's probably a great idea to follow the TOP500 list of supercomputers, for the same reason that shade-tree mechanics might follow the pit stops at Indy, Formula One or NASCAR.

As a practical point, it's also important to follow how trends in supercomputing end up in the AI hardware market … the bleeding edge of supercomputing has a way of working its way down to mobile and even edge devices … and the general philosophies behind supercompute will eventually prove to be the most important driver of what is needed in EDA tools to produce affordable, high-volume, commodity processing capabilities.

It's certainly interesting to try to guess what kinds of secrets the fastest of the fastest teams are using, but in the LONG RUN … legitimately OPEN models will always win and gnaw away at even the best proprietary tech outfits, e.g., look at Linux vs Windows running in containers, or the Python ecosystem vs VisualBasic … because tens of millions of engineers can work on a problem versus the staff in the cube farms at different locations of a single company. No matter how incredibly smart you think the people on your staff are, the vast majority of the smartest people in the world are necessarily going to be working for someone else. If you have an ELITE-level job with the most elite of the elite companies, you might be tempted to think rather highly of the people you work with … but the reality is that open source development communities are where you are going to find the very smartest, most GENUINELY KIND and most visionary people in the world.

The whole POINT of participating in an open source development community … and PARTICIPATING means genuinely contributing, in a way that forms new friendships and professional relationships within that community … the whole POINT of PARTICIPATING in a development community is the personal GROWTH that comes out of building something that will GROW and will be generally NEEDED [by everyone using a computer or mobile device, or participating in an economy connected to the rest of the world] for at least the next two or three decades.

Open source development communities are not just about the quasi-famous development leads, e.g., Richard Stallman or Linus Torvalds, but also about the lots of quasi-anonymous code committers, code reviewers, documentation writers and application users who provided useful feedback to really important open source technology such as the GNU Debugger and GNU Bash, Linux, MariaDB (because open source MySQL was effectively CLOSED by Sun/Oracle), PostgreSQL, SQLite, Git, Python and the Python ecosystem including PyTorch and TensorFlow, Neo4j and GraphQL, the Open Container Initiative, Kubernetes and, of course, the development toolchains and different forms of instrumentation which embed eBPF.

Many of these open source technologies are like EDA tools in that billions of people now DEPEND upon something derived from them … even though most of those people will never realize how the things they rely upon actually depend on that technology … of course, anyone dependent upon some form of technology that includes an integrated circuit already depends, albeit INDIRECTLY, upon the EDA tools that were used to design and manufacture that integrated circuit … but the nature of AI and higher-performance, compute-intensive technologies will make these EDA tools that much more relevant. As we move toward a much more productive sharing economy, e.g., one populated by companies whose business model is based upon optimization of SHARED resources, such as Uber, AirBnB or AWS/GCP/Azure and other cloud compute services, the degree to which resources are shared in a hyper-efficient manner will be enabled by the processing capability of smarter systems that are built with the help of EDA tools.

When we look at EDA tools … we sort of immediately start thinking about the underlying, foundational things like the mathematical libraries that these sophisticated tools are going to need … then we also think about debugging software built on those libraries … which takes us in the direction of tools for tracing, profiling and packet filtering with BPF, as well as the parts of development toolchains and infosec instrumentation that embed eBPF, like BCC, bpftrace, perf, ply, SystemTap, PCP, Weave Scope, Cilium, Suricata, systemd, iproute2, p4c-xdp, the LLVM compiler infrastructure, libbpf, bpftool, Cilium's ebpf-go, ebpf_asm and Falco.
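
To make that concrete, here is a minimal sketch of the kind of kernel-level tracing these eBPF toolchains enable, using BCC's Python front end; it assumes BCC is installed and that the script is run as root on an eBPF-capable Linux kernel, and the traced event and printed message are purely illustrative.

```python
# Minimal BCC sketch: attach a tiny eBPF program to the clone() syscall entry
# and print a line to the shared kernel trace pipe each time a process forks.
# Assumes the BCC toolkit is installed and the script is run as root.
from bcc import BPF

prog = r"""
int hello(void *ctx) {
    bpf_trace_printk("new process forked\n");
    return 0;
}
"""

b = BPF(text=prog)
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="hello")

print("Tracing clone() syscalls ... Ctrl-C to quit")
b.trace_print()  # stream formatted lines from the kernel trace pipe
```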

To return to the earlier point: PARTICIPATING in a development community is about the personal GROWTH that comes out of building something that will GROW and be generally NEEDED for at least the next two or three decades.

It’s all about GROWTH … when you think deeply about this, you start to realize that it is primarily about your own growth.

Thinking about GROWTH

Thinking about GROWTH is really about thinking about what the world WILL need in the next five, ten, twenty-five, fifty years … it's primarily about your own growth, not so much about investment growth, but those two things should be heading in the same general direction.

When it comes to thinking about what is needed most in the realm of open source, we first want to step back and really think deeply about the kinds of companies that exemplify new GROWTH, the kind that happens because the product or service of the company creates something BRAND NEW for customers or users that did not exist before the company started up.

The best examples of these kinds of startups for most people to follow will probably be the current unicorn companies. It might be true that the average small investor cannot ordinarily invest directly in these unicorns, but what we are after is studying the general trends or gist of the ideas they represent. When we want to understand the trajectory of what will be important in five or ten years, we can follow the population of unicorns over time to discern the TRENDS in terms of what kinds of things get funded AND what kinds of things do well after funding. This TRAJECTORY can give individuals a good idea of where they should aim to develop their skills, toward the kinds of things that will be in demand in five, ten or twenty years.

In other words, there's nothing magic about the kinds of foundational mathematical libraries used by EDA tools or CAD software … we generally know that these skillsets WILL BE in demand … but competition in this realm will be intense and, much like professional football, there's not really any room on the field for the guy who just enjoys watching the game on teevee OR for the Al Bundy type who wants to relive the time he scored four touchdowns in the city championship decades ago, i.e., it doesn't matter what you did back in the day, in the 2010s, 2000s, 1990s or before – there's no room on the field for teevee watchers or old guys reliving their glory days.

Working in open source development communities is all ABOUT WHAT YOU CAN DEMONSTRATE, THROUGH YOUR OPEN SOURCE PORTFOLIO, THAT YOU WILL BE ABLE TO DO FIVE OR TEN YEARS FROM NOW … it's entirely future-driven … NOT about looking back or being a spectator; it's about training and making yourself aware of trends so that you train for the next war, not the last one, and are ready to participate in FUTURE GROWTH.

When it comes to FUTURE GROWTH and publicly traded companies that investors can actually invest in … doubling your money every five years or less is about where you should aim for your investments to perform (see the quick arithmetic sketched below); in some cases you can take on slightly more risk, but you should not take complete fliers or invest in wild-ass schemes with no sustained record of delivering bigger and bigger accomplishments. Above all, your INVESTMENTS should not fluctuate wildly or erratically because of governmental policy shifts or rumors of shifts. Bitcoin, for example, is a CASINO bet, because it is ENTIRELY dependent upon the whims and changing moods of governments and policy makers, i.e., there is no GROWTH in terms of new accomplishments on the horizon from the Bitcoin universe … it will swing radically up and down because of Fed policy and Treasury policy.
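
For what it's worth, here is the quick arithmetic behind that "double every five years" benchmark, sketched in Python; the numbers are just the compound-growth identity, nothing specific to any company or fund.

```python
# Doubling in 5 years implies an annualized growth rate r such that (1 + r)**5 == 2.
years = 5
target_multiple = 2.0

annual_rate = target_multiple ** (1.0 / years) - 1.0
print(f"Doubling in {years} years requires about {annual_rate:.1%} per year")
# -> roughly 14.9% per year, which is the bar the surrounding paragraphs have in mind.
```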

REAL growth looks and behaves very differently. Of course there are unicorn companies like Anduril that are growing rapidly by radically overhauling the way Dept of Defense technology is procured, but the average investor cannot participate in Anduril unless that person happens to qualify for a career there with the kinds of talent Anduril is recruiting. Other examples of REAL growth would be the growth in the technologies that underlie the explosion in high performance computing necessary for AI … that would mean companies like NVIDIA or ASML; some people would include companies like Microsoft or Google or Apple. The best examples of what is possible from PURE growth are a company like NVIDIA, which began 30 years ago (up 2163% in five years), or ASML, which was founded forty years ago (up 439% in five years). Microsoft is not as much of a PURE growth play, but it still has some growth left in it thanks to the GitHub ecosystem and now Copilot/Bing AI (up 264% in five years). Even Google, although it is mainly about SEARCH [and recommendation, through Gmail and Google Docs] and the monetization of search/recommendation with advertising, is still involved in AI through its acquisition of DeepMind and its development of TPU tensor core chips, and the rest of its Google Cloud enterprise is still somewhat growthy (up 133% in five years, just ahead of the Nasdaq Composite Index at 114% over five years).

The companies that outpace the market indices are broadly capable developers of talent with solid records of growth in their accomplishments: they have really deep benches that are about more than just acquiring talent, and they also feature solid systems for developing that talent farther than it could develop on its own, producing dynamic product management teams that generate new products and services that just keep on growing and growing … the companies that you want to invest in grow over the very long term because they consistently demonstrate better results at using and developing talent in some sort of system that keeps developing and attracting more talent because the products keep getting better and better … the companies that you want to divest from are those that are no longer developing the talent that produces brand new things for customers.

Let's say that you wanted to become some sort of hardware verification engineer for a company like NVIDIA … you will have to go a little bit beyond just being able to regurgitate Wikipedia pages for terms such as system-level RTL, SystemVerilog, the Universal Verification Methodology (UVM), Synopsys VCS or equivalent HDL simulators like Mentor Graphics (now Siemens EDA) ModelSim, the Debussy debug tool, or the GNU Debugger as used during hardware verification.
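
As one small illustration of going beyond the Wikipedia page, here is a minimal, hypothetical testbench sketch using cocotb, an open source Python-based verification framework (cocotb is not named above, and the DUT signal names clk, rst and count are invented for this example):

```python
# Hypothetical cocotb testbench for a simple counter DUT; assumes cocotb and a
# supported HDL simulator (e.g., Icarus Verilog or Questa) are installed.
import cocotb
from cocotb.clock import Clock
from cocotb.triggers import RisingEdge


@cocotb.test()
async def counter_counts_up(dut):
    """Drive a clock and reset, then check that the counter increments every cycle."""
    cocotb.start_soon(Clock(dut.clk, 10, units="ns").start())  # 100 MHz clock

    dut.rst.value = 1
    await RisingEdge(dut.clk)
    dut.rst.value = 0
    await RisingEdge(dut.clk)

    previous = int(dut.count.value)
    for _ in range(10):
        await RisingEdge(dut.clk)
        current = int(dut.count.value)
        assert current == previous + 1, f"expected {previous + 1}, got {current}"
        previous = current
```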

You kind of need to understand the how/why/what of electronic design automation (EDA) toolchains, including just enough of the history to be dangerous, i.e., to be able to understand the context of the tools that you will be using. As metal–oxide–semiconductor (MOS) technology progressed, millions and then billions of MOS transistors could be placed on one chip – which means that, well before we entered the billion-transistor processor era, integrated circuit designs already required the kind of exhaustively thorough planning and simulation that would be impossible without modern EDA toolchains.

Now … let's say … just for the sake of argument, that you were interested in helping people get up to speed with open source EDA tools … knowing full well that maximizing productivity for the best of the best designers will warrant the top-of-the-line proprietary EDA tools … but also knowing that the open source EDA tools are good enough for DEVELOPING A BASIC UNDERSTANDING OF ALMOST ALL OF THE WORK that needs to be done.

It's probably a good idea to BEGIN our discussion of the open source development communities that surround different open source EDA tools with a very high-level introduction to the basic EDA design flow that is typically used for Very Large Scale Integrated circuits, Very High Speed Integrated Circuits, Ultra Large-Scale Integration, Wafer-Scale Integration, Systems on a Chip, 3D Integrated Circuits, Multi-Chip Modules, Flip Chip Multi-Chip Modules and an increasing variety of hardware … especially for anyone who has not had much experience using an EDA design flow to produce any kind of integrated circuit, such as microprocessors (CPUs), graphics processors (GPUs), tensor processors (TPUs), AI accelerators, ASICs, FPGAs and the vast array of other large combinations of integrated circuits.
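
As a rough orientation before diving in, here is one way to sketch the canonical RTL-to-GDSII flow in Python, pairing each stage with an open source tool commonly used for it; the pairings are illustrative assumptions rather than a complete or authoritative flow, and real flows (including OpenROAD's) add many more checks, iterations and sign-off steps.

```python
# Illustrative sketch of a typical RTL-to-GDSII flow, one open source tool per stage.
DESIGN_FLOW = [
    ("RTL design & simulation",   "Verilator / Icarus Verilog"),
    ("Logic synthesis",           "Yosys"),
    ("Floorplanning",             "OpenROAD"),
    ("Placement",                 "OpenROAD"),
    ("Clock tree synthesis",      "OpenROAD"),
    ("Routing",                   "OpenROAD"),
    ("Static timing analysis",    "OpenSTA"),
    ("DRC / LVS physical checks", "Magic / KLayout / Netgen"),
    ("GDSII export",              "KLayout"),
]

for step, (stage, tool) in enumerate(DESIGN_FLOW, start=1):
    print(f"{step}. {stage:26s} -> {tool}")
```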

After we have gotten lost deep in the weeds without really trying, we can see WHY it is so easy for people to get badly lost comparing this tool vs that tool in the EDA design flow … especially since the landscape is driven to be competitive, nobody rests on their laurels, and the whole thing evolves at an impossibly rapid pace … so let's GO BACK TO THE FOUNDATION and think about what will always be the commonalities … what HAS TO underpin all of the work that gets done … even with AI coding assistants and other AI tools handling the gigantic volume of tedious chores and checks, i.e., on the order of trillions of tasks per day or per hour.

Proficiency with a debugger is always going to be ESSENTIAL … even though most debuggers will probably look like some derivative of the open source GNU Project Debugger (GDB), surrounded by different advanced tools and information dashboards used in debugging and test. The debugging and test workload will be an order of magnitude greater with more automation and more intelligent EDA tools … the old expression from 2020, "Code for 6 minutes. Debug for 6 hours," will be more like "Let AI generate code for 6 seconds. Debug for 6 days," with four dashboards full of output from different AI-enabled tools. It's sort of easy to predict … when we automate the creation of things, testing and debugging always become an EVEN BIGGER part of the human-essential work, although the nature of how we use AI in test and debug tools will obviously change as AI assistants enter the EDA design flow.
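
Since GDB keeps coming up, it is worth knowing that modern GDB embeds a Python interpreter for exactly this kind of automation-and-dashboard work; below is a minimal sketch of scripting GDB from Python, where the function being counted (malloc) is just a placeholder for whatever you actually care about.

```python
# Minimal sketch of GDB's embedded Python API: load it inside gdb with
#   (gdb) source count_calls.py
# against a program that is already loaded; the traced function is a placeholder.
import gdb


class CountingBreakpoint(gdb.Breakpoint):
    """A breakpoint that counts hits without ever stopping the program."""

    def __init__(self, location):
        super().__init__(location)
        self.hits = 0

    def stop(self):
        self.hits += 1
        return False  # returning False tells gdb not to halt execution


bp = CountingBreakpoint("malloc")
gdb.execute("run")
print(f"malloc was called {bp.hits} times before the program exited")
```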

The importance of debugging means that the ability to understand what's happening under the hood really matters … because underneath ALL of these executable specifications and EDA tools are the mathematical software libraries that form the foundation of the more easily-used features that drive the productivity of integrated circuit designers. Any modern scientific or machine learning application, or any compute-intensive part of the EDA chain, will require an entangled combination of message passing (e.g., MPI), multi-threading (which might be CPU-specific hyper-threading) and GPU-specific programming (e.g., CUDA) to execute.
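
To give a feel for the message-passing layer of that combination, here is a tiny sketch using mpi4py, the Python bindings for MPI; mpi4py and the workload size are assumptions for illustration, and production codes would typically do this in C/C++ or Fortran with threading and CUDA layered in.

```python
# Minimal MPI sketch with mpi4py: each rank computes a partial sum in parallel,
# then the partial results are combined with an allreduce across all ranks.
# Run with something like: mpiexec -n 4 python partial_sums.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each rank owns a different slice of a (hypothetical) million-element workload.
n_total = 1_000_000
chunk = np.arange(rank, n_total, size, dtype=np.float64)

local_sum = chunk.sum()                             # node-local work (could be threaded or on a GPU)
global_sum = comm.allreduce(local_sum, op=MPI.SUM)  # message passing across ranks

if rank == 0:
    print(f"sum over {size} ranks = {global_sum:.0f}")
```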

That means having at least a passing familiarity with the mathematical software libraries that underpin the EDA tools … being conversant with what those libraries do and how they do it is always going to be essential – just as having professional skills in programming is always going to be in demand somewhere, in almost the same manner that having professional-level skills in a trade like being an electrician or a plumber is going to be important … the skills that really matter are in integration, and that usually means being able to understand the details, but also see the larger picture and quickly SOLVE the whole problem in a cost-effective, standards-compliant manner … metaphorically speaking, it's not just about the porcelain interface or the visible fixtures, but also about getting the right pipes to the fixtures and ensuring that the drains actually drain and stay unclogged. It's really about the big picture and solving the root cause of the problem, rather than shooting from the hip and part-swapping just because you can swap in a part that will fit … this involves understanding the particular stakeholder needs of the larger situation, as well as the budget, resources and time available, well enough to stabilize the situation and develop a longer-term project work breakdown structure, even if that means involving qualified subcontractors and/or a general contractor. Familiarity with mathematical software libraries and standards is just like plumbing … don't underestimate how impossibly tough plumbing can be [especially when it has been done wrong], but realize that there's a way to do it AND a way that it has to be done to interoperate with the rest of the world.

Understanding mathematical software libraries includes knowing something about the state of recent activity in those libraries, as well as something about where the standards are moving, in order to use a language like C++ or C WELL. The Trilinos Project is one such effort to facilitate the design, development, integration and ongoing support of mathematical software libraries, and it illustrates how these different projects work together [and where fragilities creep in from shared dependencies on certain projects]; these projects tend to fork, but they do not really re-invent the wheel. Trilinos, for example, relies heavily on the Exascale Computing Project's Kokkos C++ Performance Portability Programming ecosystem. Within that ecosystem, Kokkos Kernels implements local (i.e., not using MPI) computational kernels for linear algebra and graph operations on top of the Kokkos shared-memory parallel programming model, while Kokkos Core implements a C++ programming model for writing performance-portable, on-node parallel code targeting all major HPC platforms (typically combined with MPI for communication between nodes). For that purpose it provides abstractions for both the parallel execution of code and data management. Kokkos is designed to target complex node architectures with N-level memory hierarchies and multiple types of execution resources. Kokkos currently can use NVIDIA's CUDA, AMD's ROCm HIP, SYCL, HPX, OpenMP and C++ threads as backend programming models, with several other backends under development.

Other notable players in the field of mathematical software libraries used in high-performance, compute-intensive scientific programming include SUNDIALS and the hypre project (exascale-capable libraries for adaptive time-stepping and scalable solvers), the Extreme-scale Scientific Software Stack (E4S) of the US Dept of Energy's Exascale Computing Project, the Portable, Extensible Toolkit for Scientific Computation (PETSc, pronounced PET-see), the Multiphysics Object Oriented Simulation Environment (MOOSE), Open Source Field Operation and Manipulation (OpenFOAM) for computational fluid dynamics, the Distributed and Unified Numerics Environment (DUNE) … and hosts of other numerical libraries.
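
For a small-scale taste of what "adaptive time-stepping" means in libraries like SUNDIALS, here is a sketch using SciPy's solve_ivp; SciPy is a stand-in here (SUNDIALS itself is a C library with its own interfaces), and the decay equation is just an illustrative toy problem.

```python
# Toy example of adaptive time-stepping: integrate dy/dt = -2*y with an adaptive
# Runge-Kutta method, letting the solver choose its own step sizes to meet tolerances.
import numpy as np
from scipy.integrate import solve_ivp

def decay(t, y):
    return -2.0 * y

solution = solve_ivp(decay, t_span=(0.0, 5.0), y0=[1.0],
                     method="RK45", rtol=1e-8, atol=1e-10)

steps = np.diff(solution.t)
print(f"solver took {steps.size} steps, ranging from {steps.min():.2e} to {steps.max():.2e}")
print(f"y(5) ~ {solution.y[0, -1]:.6e}  (exact: {np.exp(-10):.6e})")
```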

Since expertise in the application domains for different forms of sensors and instruments that interact with the world is going to be important … the need for scripting languages and numerical data science scratchpads will be even greater than it is now, just as part of the toolkit that's taken for granted, i.e., maybe more essential than being able to use a whiteboard or a pen and notepad … accordingly, we should also mention the growing set of Python libraries used for scientific computing, including the old favorites – NumPy, SciPy, Matplotlib, Pandas, SymPy, Jupyter notebooks – as well as the PyTorch, TensorFlow, Keras and Scikit-learn ecosystems … and the newer proliferation of favorites like Dask, Xarray, HoloViews, Bokeh, Plotly, Dash, Streamlit, Altair, Seaborn, Statsmodels, Scikit-image, NetworkX, Gensim, NLTK, spaCy, TextBlob, Pattern, PyMC3, Edward, Pyro, ArviZ, Theano, Lasagne, Pymc-learn, et al. This set of data science scripting tools will NEVER stop expanding … IT CAN'T … because the application domain for sensors and instruments that interact with the world is always going to be GROWING and changing, and the set of new uses will always be EXPANDING.
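
As a reminder of why these scratchpads get taken for granted, here is the kind of quick exploration they make routine; the "sensor" trace below is synthetic, generated purely for the sketch.

```python
# A tiny scratchpad-style exploration: fake a noisy sensor trace, then inspect it
# with the usual NumPy / Pandas / Matplotlib trio.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=42)
t = np.linspace(0.0, 10.0, 1_000)
signal = np.sin(2.0 * np.pi * 0.5 * t) + 0.2 * rng.standard_normal(t.size)

df = pd.DataFrame({"t": t, "signal": signal})
df["smoothed"] = df["signal"].rolling(window=25, center=True).mean()

print(df.describe())                       # quick numeric summary
df.plot(x="t", y=["signal", "smoothed"])   # quick visual sanity check
plt.show()
```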