Category Archives: Systems

Machine-Learning Maestro Michael Jordan on the Delusions of Big Data and Other Huge Engineering Efforts – IEEE Spectrum

Interesting interview with Michael Jordan (no, not that Michael Jordan) about the challenges with Big Data.  It reminds me of a conversation that I had with a good friend of mine just the other day.  We discussed how Big Data is meaningless in the absence of context.  Michael Jordan’s point touches on the other side of the problem with Big Data and BD analytics: meaning can be read into spurious connections, and confidence in the results can be overstated.

A quote:

“If I have no principles, and I build thousands of bridges without any actual science, lots of them will fall down, and great disasters will occur.

Similarly here, if people use data and inferences they can make with the data without any concern about error bars, about heterogeneity, about noisy data, about the sampling pattern, about all the kinds of things that you have to be serious about if you’re an engineer and a statistician—then you will make lots of predictions, and there’s a good chance that you will occasionally solve some real interesting problems. But you will occasionally have some disastrously bad decisions. And you won’t know the difference a priori. You will just produce these outputs and hope for the best.”

via Machine-Learning Maestro Michael Jordan on the Delusions of Big Data and Other Huge Engineering Efforts – IEEE Spectrum.
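
To make the point about spurious connections concrete, here is a minimal sketch (my own illustration, not something from the interview). It generates pure noise, a random "target" and many random "features", and reports the strongest correlation found between them. The function names, the feature count, and the sample size are all assumptions chosen for the example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_features=1000, n_samples=20, seed=0):
    """Largest |correlation| between a noise target and many noise features.

    Everything here is random, so any strong correlation is spurious.
    The default sizes are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, target)))
    return best


if __name__ == "__main__":
    print(f"strongest correlation in pure noise: {strongest_spurious_correlation():.2f}")
```

With few samples and many candidate variables, at least one of them will almost certainly look strongly related to the target by chance alone, which is why the error bars and the sampling pattern matter.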

2014-06B: From Around The Web: 5 Next Steps For Data Visualization | Blog

Not sure this is new news (see article at bottom of this post), but good to see it being articulated.

I would say the trends outlined below are already happening.  There is, however, still a gap between the three dimensions of data visualization production:

  • The tools with many templates (Excel/Tableau) are not very creative,
  • The creative tools (Photoshop, Illustrator) cannot adapt to new or updated data, and
  • Programming tools still have a bit of a learning curve for artists and others to get on board.

The one hope is with parameter-based tools, such as NodeBox or this proof of concept from Bret Victor (from which I borrowed the above articulation of the 3 dimensions).

Article Link: 5 Next Steps For Data Visualization | Blog.

2014-05: Atomic Design

This is a throwback from a few years ago, but I read a news clipping this morning that reminded me of it, so I wanted to get it out there.

When I was in my previous role, I used the concept of “products and by-products” for the work that we did.  The product is the deliverable; the by-product (even though the word has a bad connotation) is the waste that wasn’t part of the final project deliverable but that can be reused for future projects.

Over the course of the 5 years that I was at that company, that thinking evolved and became more nuanced, turning into “Atoms, Molecules, Matter”.  Let me walk you through this a bit more.

Atoms: Atoms are the most fundamental unit of a project.  An atom is a specific data element (e.g., age of population) or a specific functional component (e.g., a text input form) that can be used over and over in different ways.  Atoms on their own don’t serve much purpose beyond being the building blocks for something else.

Molecules: Molecules are combinations of atoms that can be used together.  They have more functionality than atoms and can be used to build up matter.  For example, a text input form combined with a “search” button and a piece of text that says “enter your search query here” can be seen as a search input molecule.  All the logic can be retained in that molecule.  But the molecule is only part of the answer, because you can only do a little with it.

Matter: Different molecules can be brought together to form something that has a lot of functionality, and that is actually useful… that’s matter.  In the “search” example above, the front end search molecule (the form) and the back end that presents results (which also includes the algorithms to send out search requests) can together be seen as search matter.  This “search” matter can now be used to search for information, and can be easily re-used without having to go back to the building blocks.
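
To make the three levels a bit more concrete, here is a rough sketch of the “search” example in Python. It is only an illustration of the idea, not code from the projects I mentioned: every class and function name (TextInput, Button, SearchForm, SearchBackend, Search, make_search) is hypothetical, and the “back end” is just a substring match over a list of strings.

```python
from dataclasses import dataclass
from typing import List


# Atoms: the smallest reusable units (hypothetical names).
@dataclass
class TextInput:
    placeholder: str
    value: str = ""


@dataclass
class Button:
    label: str


# Molecule: atoms combined, with their logic kept inside the molecule.
@dataclass
class SearchForm:
    text_input: TextInput
    button: Button

    def submit(self) -> str:
        """Return the cleaned-up query, as if the button had been pressed."""
        return self.text_input.value.strip().lower()


# Second molecule: a stand-in back end that finds matching documents.
@dataclass
class SearchBackend:
    documents: List[str]

    def find(self, query: str) -> List[str]:
        if not query:
            return []
        return [doc for doc in self.documents if query in doc.lower()]


# Matter: molecules brought together into something useful on its own.
@dataclass
class Search:
    form: SearchForm
    backend: SearchBackend

    def run(self, raw_query: str) -> List[str]:
        self.form.text_input.value = raw_query
        return self.backend.find(self.form.submit())


def make_search(documents: List[str]) -> Search:
    """Assemble the search matter without touching the atoms directly."""
    form = SearchForm(
        TextInput(placeholder="enter your search query here"),
        Button(label="search"),
    )
    return Search(form, SearchBackend(list(documents)))


if __name__ == "__main__":
    search = make_search(["Age of population", "Text Input Form"])
    print(search.run("  AGE "))
```

The point of the sketch is the last function: once the matter exists, a new project can call make_search and never needs to go back to the individual atoms.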

There’s got to be a fourth level, but I have not found its name yet…  Something that brings all the matter together.