Wednesday, 31 October 2012
Interesting Snippets from 2012-10-31
-
Why Things Fail: From Tires to Helicopter Blades, Everything Breaks Eventually | Wired Design | Wired.com
Product failure is deceptively difficult to understand. It depends not just on how customers use a product but on the intrinsic properties of each part—what it’s made of and how those materials respond to wildly varying conditions. Estimating a product’s lifespan is an art that even the most sophisticated manufacturers still struggle with. And it’s getting harder. In our Moore’s law-driven age, we expect devices to continuously be getting smaller, lighter, more powerful, and more efficient. This thinking has seeped into our expectations about lots of product categories: Cars must get better gas mileage. Bicycles must get lighter. Washing machines need to get clothes cleaner with less water. Almost every industry is expected to make major advances every year. To do this they are constantly reaching for new materials and design techniques. All this is great for innovation, but it’s terrible for reliability.
-
Fastly.com - Amazon S3 Traffic Overview
The graph below represents real-time response latency for Amazon S3 as seen by Fastly's Ashburn, VA edge server. Each column represents a histogram ranging from 0 to 4 seconds. Darker colors represent more traffic in a particular latency range and lighter colors represent less traffic.
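A rough sketch of how one column of such a graph could be computed: bucket the observed latencies into a 0 to 4 second histogram, then map each bucket's count to a darker or lighter shade. The bucket width, the text palette, and the function names are illustrative assumptions, not anything taken from Fastly's implementation.

```python
def latency_histogram(latencies_s, bucket_width_s=0.5, max_latency_s=4.0):
    """Count samples per latency bucket covering 0..max_latency_s seconds.

    Assumption: samples above max_latency_s are clamped into the last bucket,
    and negative samples are ignored.
    """
    n_buckets = int(max_latency_s / bucket_width_s)
    counts = [0] * n_buckets
    for latency in latencies_s:
        if latency < 0:
            continue
        index = min(int(latency / bucket_width_s), n_buckets - 1)
        counts[index] += 1
    return counts


def shade(counts, palette=" .:*#"):
    """Map each bucket count to a character; denser characters mean more traffic."""
    peak = max(counts, default=0) or 1  # avoid dividing by zero on empty input
    return "".join(palette[(c * (len(palette) - 1)) // peak] for c in counts)


if __name__ == "__main__":
    # Hypothetical response latencies in seconds for one time slice.
    samples = [0.1, 0.2, 0.6, 3.9, 5.0]
    column = latency_histogram(samples)
    print(column)
    print(repr(shade(column)))
```

Rendering one such column per time slice, side by side, gives a heat-map-style view in which the darkest cells mark the latency ranges carrying the most traffic.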
Interesting Snippets from 2012-10-26
-
Home · metamx/druid Wiki · GitHub
Druid is a distributed, column-oriented analytical datastore. It was originally created to resolve query latency issues seen with trying to use Hadoop to power an interactive service.
-
Introducing Cloudera Impala - Cloudera Support
Cloudera Impala™ provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS or HBase. In addition to using the same unified storage platform, Impala also uses the same metadata, SQL syntax (Hive SQL), ODBC driver and user interface (Hue Beeswax) as Apache Hive. This provides a familiar and unified platform for batch-oriented or real-time queries.
Interesting Snippets from 2012-10-22
-
jq
jq is a lightweight and flexible command-line JSON processor.
Interesting Snippets from 2012-10-19
-
IBM's Watson Is Learning Its Way To Saving Lives | Fast Company
A few years ago, IBM’s new computer was a game-playing curiosity. Now Watson is poised to change the way human beings make decisions about medicine, finance, and work.
-
Geek philanthropy: Data huggers | The Economist
BUSINESSES avidly mine data to improve their efficiency. Non-profit groups have plenty of information, too. But they can rarely afford to hire number-crunchers. Now a bunch of philanthropic geeks at DataKind, a New York-based charity, are helping other do-gooders work more productively and quantify their achievements for donors, who like to see that their money is well spent.
A typical DataKind two-day “hackathon” last month in London attracted 50 people who worked in three teams. One pored over the records of Place2Be, which offers counselling to troubled schoolchildren. Crunching the data showed that boys tend to respond better than girls, though girls who lived with only their fathers showed the biggest improvements of all. The charity did not know that.
Interesting Snippets from 2012-10-18
-
Dark Social: We Have the Whole History of the Web Wrong - Alexis C. Madrigal - The Atlantic
This means that this vast trove of social traffic is essentially invisible to most analytics programs. I call it DARK SOCIAL. It shows up variously in programs as "direct" or "typed/bookmarked" traffic, which implies to many site owners that you actually have a bookmark or typed www.theatlantic.com into your browser. But that's not actually what's happening a lot of the time. Most of the time, someone Gchatted someone a link, or it came in on a big email distribution list, or your dad sent it to you.
-
Open Data Structures
Open Data Structures covers the implementation and analysis of data structures for sequences (lists), queues, priority queues, unordered dictionaries, ordered dictionaries, and graphs.
Data structures presented in the book include stacks, queues, deques, and lists implemented as arrays and linked lists; space-efficient implementations of lists; skip lists; hash tables and hash codes; binary search trees including treaps, scapegoat trees, and red-black trees; integer searching structures including binary tries, x-fast tries, and y-fast tries; heaps, including implicit binary heaps and randomized meldable heaps; graphs, including adjacency matrix and adjacency list representations; and B-trees.
-
Data Wrangler
Wrangler is an interactive tool for data cleaning and transformation. Spend less time formatting and more time analyzing your data.