Tuesday, 27 November 2012
Interesting Snippets from 2012-11-27
-
Home · Netflix/Hystrix Wiki · GitHub
In a distributed environment, failure of any given service is inevitable. Hystrix is a library designed to control the interactions between these distributed services, providing greater tolerance of latency and failure. Hystrix does this by isolating points of access between the services, stopping cascading failures across them, and providing fallback options, all of which improve the system's overall resiliency. Hystrix evolved out of resilience engineering work that the Netflix API team began in 2011. Over the course of 2012, Hystrix continued to evolve and mature, eventually leading to adoption across many teams within Netflix. Today, tens of billions of thread-isolated and hundreds of billions of semaphore-isolated calls are executed via Hystrix every day at Netflix, and a dramatic improvement in uptime and resilience has been achieved through its use.
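To make the idea of stopping cascading failures and providing fallbacks concrete, here is a minimal sketch of a circuit breaker in Python. It is not Hystrix's API (Hystrix is a Java library); the class name `CircuitBreaker` and its parameters are hypothetical, and thread or semaphore isolation is left out entirely.

```python
import time


class CircuitBreaker:
    """Illustrative circuit breaker with a fallback (hypothetical, not Hystrix's API)."""

    def __init__(self, failure_threshold=3, reset_after=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_after = reset_after              # seconds to wait before a trial call
        self.clock = clock                          # injectable so the timing can be tested
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, fallback):
        # While the circuit is open, skip the dependency and answer from the fallback.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            # Reset window elapsed: fall through and allow one trial call.
        try:
            result = func()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_profile, lambda: cached_profile)`, where both names are placeholders. Once the threshold is reached, the failing dependency is no longer invoked until the reset window passes, which is what keeps one slow service from tying up its callers.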
Interesting Snippets from 2012-11-14
-
Home · square/cube Wiki · GitHub
Cube is a system for collecting timestamped events and deriving metrics. By collecting events rather than metrics, Cube lets you compute aggregate statistics post hoc. It also enables richer analysis, such as quantiles and histograms of arbitrary event sets. Cube is built on MongoDB and available under the Apache License.
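The point about collecting events rather than metrics can be sketched as follows. This is not Cube's API and does not use MongoDB; the event shape (`type`, `time`, `data`) and the helper names are assumptions chosen for illustration, with events held in a plain list.

```python
import math
from datetime import datetime

# Hypothetical raw events: one record per request, with its duration.
events = [
    {"type": "request", "time": datetime(2012, 11, 14, 12, 0, s), "data": {"duration_ms": d}}
    for s, d in enumerate([120, 85, 430, 97, 210, 64, 150, 990, 110, 75])
]


def quantile(values, q):
    """Nearest-rank quantile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def metric(events, event_type, field, reducer):
    """Derive a metric after the fact by reducing one field over matching events."""
    values = [e["data"][field] for e in events if e["type"] == event_type]
    return reducer(values)


# Because the raw events were kept, new statistics can be asked for later.
median = metric(events, "request", "duration_ms", lambda v: quantile(v, 0.5))
p90 = metric(events, "request", "duration_ms", lambda v: quantile(v, 0.9))
slowest = metric(events, "request", "duration_ms", max)
```

Had only a pre-aggregated mean been stored, the median and the 90th percentile could not be recovered afterwards; keeping the events is what makes those questions answerable.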
Interesting Snippets from 2012-11-13
-
Teaching Programming To A Highly Motivated Beginner | blog@CACM | Communications of the ACM
In the past year, many CS professors and education pundits have written about how MOOCs are scaling up CS education to hundreds of thousands of students. I'm going to take a different approach here and tell the story of how I spent nine months teaching computer programming to one student.
-
http - The definitive guide to forms based website authentication - Stack Overflow
Form-based authentication for websites. Please help us create the definitive resource for this topic. We believe that Stack Overflow should not just be a resource for very specific technical questions, but also for general guidelines on how to solve variations on common problems. Form…
-
Measure Anything, Measure Everything « Code as Craft
StatsD is a simple NodeJS daemon (and by “simple” I really mean simple — NodeJS makes event-based systems like this ridiculously easy to write) that listens for messages on a UDP port. (See Flickr’s “Counting & Timing” for a previous description and implementation of this idea, and check out the open-sourced code on github to see our version.) It parses the messages, extracts metrics data, and periodically flushes the data to graphite. We like graphite for a number of reasons: it’s very easy to use, and has very powerful graphing and data manipulation capabilities. We can combine data from StatsD with data from our other metrics-gathering systems. Most importantly for StatsD, you can create new metrics in graphite just by sending it data for that metric. That means there’s no management overhead for engineers to start tracking something new: simply tell StatsD you want to track “grue.dinners” and it’ll automagically appear in graphite. (By the way, because we flush data to graphite every 10 seconds, our StatsD metrics are near-realtime.)
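The parse, aggregate, and flush cycle described above can be sketched in a few lines of Python. This is not Etsy's NodeJS code: the `Aggregator` class is hypothetical, only counters (`|c`) and timers (`|ms`) are handled, sample rates are ignored, and the UDP listener and the connection to Graphite are left out. The flushed lines follow Graphite's plaintext `path value timestamp` form.

```python
from collections import defaultdict


def parse(message):
    """Split 'name:value|type' into its parts (any sample-rate suffix is ignored)."""
    name, rest = message.split(":", 1)
    value, kind = rest.split("|")[:2]
    return name, float(value), kind


class Aggregator:
    """Hypothetical in-memory aggregator for one flush interval."""

    def __init__(self):
        self.counters = defaultdict(float)
        self.timers = defaultdict(list)

    def handle(self, message):
        name, value, kind = parse(message)
        if kind == "c":        # counter: sum the increments
            self.counters[name] += value
        elif kind == "ms":     # timer: keep samples until the flush
            self.timers[name].append(value)

    def flush(self, timestamp):
        """Return 'path value timestamp' lines and reset state for the next interval."""
        lines = [f"{name} {total:g} {timestamp}" for name, total in sorted(self.counters.items())]
        for name, samples in sorted(self.timers.items()):
            lines.append(f"{name}.mean {sum(samples) / len(samples):g} {timestamp}")
        self.counters.clear()
        self.timers.clear()
        return lines
```

Nothing has to be registered in advance: the first time `handle("grue.dinners:1|c")` is called, the metric simply starts to exist, which mirrors the "no management overhead" point in the snippet. A real daemon would feed `handle` from a UDP socket and call `flush` on a timer.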
-
gurgeh/selfspy · GitHub
Selfspy is a daemon for Unix/X11 and (thanks to @ljos!) Mac OS X that continuously monitors and stores what you are doing on your computer. This way, you can get all sorts of nifty statistics and reminders on what you have been up to. It is inspired by the Quantified Self movement and Stephen Wolfram's personal key logging. See Example Statistics, below, for some of the fabulous things you can do with this data.
-
hadoop-20/src/contrib/corona at master · facebook/hadoop-20 · GitHub
Hadoop Corona is the next version of Map-Reduce. The current Map-Reduce has a single Job Tracker that reached its limits at Facebook. The Job Tracker manages the cluster resources and tracks the state of each job. In Hadoop Corona, the cluster resources are tracked by a central Cluster Manager. Each job gets its own Corona Job Tracker, which tracks just that one job.
Interesting Snippets from 2012-11-07
-
ioerror/duraconf · GitHub
duraconf: a collection of hardened configuration files for SSL/TLS services. Hopefully this will help you make a more informed choice about which cipher list to use for different applications. What you find here are recommended configurations; you should seriously consider using them, but you still have to make some choices.
-
Statwing | Intuitive Data Analysis
Intuitive data analysis: turn data into insight. In seconds.
-
Cyclops
Cyclops is a network audit tool for service providers and enterprise networks, providing a mechanism to compare the observed behavior of the network with its intended behavior. Cyclops is able to detect several forms of route hijack attacks, i.e. cases where Internet routes are maliciously diverted from their original state. Recent incidents such as the YouTube hijack in February 2008 show that route hijacking is a real threat on the Internet.
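Comparing observed behavior with intended behavior can be illustrated with a small origin check. This is only a sketch of the general idea, not how Cyclops itself works: the `audit` function and the data layout are hypothetical, and the prefixes and AS numbers come from the ranges reserved for documentation.

```python
import ipaddress

# Intended state (assumed for illustration): which AS should originate each prefix.
intended = {"192.0.2.0/24": 64496, "198.51.100.0/24": 64497}


def audit(intended, observed):
    """Return (prefix, observed_origin, expected_origin) for each suspicious announcement.

    An announcement is flagged when it covers all or part of an intended prefix
    but is originated by a different AS.
    """
    expected = {ipaddress.ip_network(p): asn for p, asn in intended.items()}
    alerts = []
    for prefix, origin in observed:
        announced = ipaddress.ip_network(prefix)
        for own, asn in expected.items():
            if announced.subnet_of(own) and origin != asn:
                alerts.append((prefix, origin, asn))
    return alerts


# Observed announcements: the second is a more-specific prefix from an unexpected AS.
observed = [("192.0.2.0/24", 64496), ("198.51.100.0/25", 64511)]
alerts = audit(intended, observed)
```

Checking `subnet_of` rather than exact equality matters in this sketch because a more-specific announcement wins under longest-prefix matching, so a check on exact prefixes alone would miss it.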
Interesting Snippets from 2012-11-06
-
Process real-time big data with Twitter Storm
Storm is an open source, big-data processing system that differs from other systems in that it's intended for distributed real-time processing and is language independent. Learn about Twitter Storm, its architecture, and the spectrum of batch and stream processing solutions.
-
sysdream/pysqli · GitHub
PySQLi is a Python framework designed to exploit complex SQL injection vulnerabilities. It provides dedicated bricks that can be used to build advanced exploits or easily extended/improved to fit the case.
-
research.microsoft.com/en-us/um/people/ryenw/papers/DiriyeCIKM2012.pdf
Users of search engines often abandon their searches. Despite the high frequency of Web search abandonment and its importance to Web search engines, little is known about why searchers abandon beyond that it can be for good or bad reasons. In this paper, we extend previous work by studying search abandonment using both a retrospective survey and an in-situ method that captures abandonment rationales at abandonment time. We show that although satisfaction is a common motivator for abandonment, one in five abandonment instances does not relate to satisfaction.
-
The geography of start-ups: Something in the air | The Economist
Economic theory suggests four main reasons why firms in the same industry end up in the same place. First, some may depend on natural resources, such as a coalfield or a harbour. Second, a concentration of firms creates a pool of specialised labour that benefits both workers and employers: the former are likely to find jobs and the latter are likely to find staff. Third, subsidiary trades spring up to supply specialised inputs. Fourth, ideas spill over from one firm to the next, as Marshall observed. But there are also forces that encourage companies to spread out. One is the cost of transporting their products to widely dispersed customers. Another is rents, which rise when firms cluster together.