Date: Sunday, 13 Jul 2014 19:38
I just published a small library called ReactScriptLoader to make it easier to load external scripts with React. Feedback is appreciated! https://github.com/yariv/ReactScriptLoader
Author: "Yariv (noreply@blogger.com)"
Date: Tuesday, 03 Jun 2014 11:29
I just released an open source library called HDMNode. It's a Node.js-based API server and client for hosted HDM (Hierarchical-Deterministic Multisig) Bitcoin wallets. If you're interested in using it or contributing to it, please read on!

The goal of HDMNode is to make it easier for developers to build HDM wallets. Such wallets have significant security and privacy advantages over most popular wallet types, and Bitcoin users (and the ecosystem) would benefit from a greater availability of high quality HDM wallet products. I believe there's a dearth of HDM wallets because they're fairly complicated to build. I hope that HDMNode will reduce the effort as well as provide developers with a well-audited, secure codebase on which they can safely rely. It would be a shame if every developer who wanted to build such a wallet had to face the same design issues and security pitfalls.

Why did I make HDMNode and why am I releasing it now? When I started working on this project I intended to build a complete HDM wallet product. However, as time went by I realized I had bitten off more than I could chew in the timeframe I had allotted to the project. While I made a lot of progress on the backend and the API client, there was still a good amount of work to be done to make it production ready. In addition, the code needed many more eyeballs on it before I could be confident in its security. I didn't want to risk shipping a product that holds people's money with glaring flaws that I failed to catch. So, I decided that the best course of action was to open source the code and give other developers an opportunity to inspect the code, to use it in their own products, and to hopefully contribute back to the project.

While I expect HDMNode to be mostly used by hosted wallet providers, if HDMNode evolves into a complete open source wallet (with UI) it could give users who want to protect their privacy the option to host their wallet on their own servers. I don’t expect this to be the primary use case but I also don’t think that users must be forced to choose between security and privacy. With HDMNode, they could have both.

If HDMNode gains traction, I hope that its JSON-RPC based API will be standardized, allowing users to mix and match clients and servers that they trust and want to use.

What’s makes HDM wallets so great, anyway? HDM wallets’ strong security comes from their reliance on P2SH multisig addresses (as defined in BIP11 and BIP16). Such wallets store coins in addresses that are guarded by multiple private keys, each of which is generated on different machine. The typical setup is 2-of-3, where the client, the server, and a backup machine each have a key, and at least 2 keys are required to sign off on every transaction. This is far more robust than non multisig wallets, where the machine that holds the key that protects the coins becomes a single point of failure from a security perspective: if that machine gets compromised, the coins are gone.

HDM wallets also offer much better privacy than non-HD multisig wallets. Such wallets rely on a fixed set or subset of keys to generate P2SH addresses. Anyone observing the blockchain could link those addresses to each other (at least after their coins are spent) and could therefore derive the user's balance and transaction history. HDM wallets don't have this weakness because they can generate an arbitrary number of addresses, each made from a unique set of keys, from a single randomly generated seed, as defined in BIP32. Without knowing the wallet's seed, it's impossible to associate those addresses with each other.
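As a toy illustration of that determinism (emphatically not BIP32, which uses HMAC-SHA512 and chain codes; this just shows the shape of the idea):

%% Each child key is a hash of the seed and an index. Whoever holds the
%% seed can regenerate every key; to anyone else the outputs look like
%% unrelated random values, so the addresses can't be linked.
child_key(Seed, Index) when is_binary(Seed), is_integer(Index) ->
    crypto:hash(sha256, <<Seed/binary, Index:32>>).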

HDM wallets have a couple of additional benefits shared with their non-multisig HD counterparts. They make it easy for users to back up their wallet once by backing up the wallet’s seed and restore it fully at a later point regardless of the number of transactions the user has performed. This is possible because having the wallet’s seed allows you to scan the blockchain and find all the transactions that sent or received coins from or to addresses that can be derived from that seed. HDM wallets also allow users to set up a hierarchical tree of sub-wallets, where having the parent wallet’s keys allows you to derive the child wallet’s keys but not vice versa. This feature can be useful for organizations or groups who want to give some members limited ability to spend the organization’s coins or observe incoming transactions to other branches of the tree.

HDM wallets have real benefits, but, as you might have guessed, they're not perfect. Besides their implementation complexity, the main downside of HDM wallets is the initial friction when creating the wallet, at least compared to pure hosted wallets. Users have to pick a strong password and remember it (no password recovery!). If they forget their password or fail to properly back up their keys, they can lose their coins. Also, to get the full security benefits of HDM wallets, users should set up the backup key pair on a separate machine (ideally an offline one). If a user doesn't do that, and her machine is compromised at wallet creation time, a hacker could steal her coins once they're deposited into the wallet. Despite this weakness (which users can avoid without too much effort), HDM wallets are still much more secure than client side wallets, which expose the keys that guard the coins every time the user transacts and therefore have a much bigger vulnerability window.

HDMNode is currently designed to support 2-of-3 multisig wallets, with one key on the server, one key on the client, and one key in backup (ideally offline), which I expect to be the most popular option for HDM wallets. This setup combines the best security features of wallets that store private keys client side and hosted wallets that store private keys on the server. In HDMNode, the coins are safe if either the client or the server gets hacked (but not both). If the server disappears or becomes inaccessible, the user can recover her coins using the backup (offline) key. An attacker must compromise at least 2 of these different systems to steal the user's coins. While the server can't steal the coins, it can act as a security service for the client by enforcing two factor auth and by refusing to sign off on transactions that seem suspicious or that violate user-defined rules such as daily spend limits. This protects the user against attacks where the attacker gains control over the user's device and tries to steal the user's coins by sending spend requests to the server.
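A sketch of what such a server-side rule could look like (the names here are hypothetical, not HDMNode's actual API; spent_today/1 stands in for a database lookup):

%% Refuse to co-sign a transaction that would push the user past her
%% daily spend limit.
approve_signing(UserId, Amount, DailyLimit) ->
    case spent_today(UserId) + Amount =< DailyLimit of
        true -> sign;
        false -> {reject, daily_limit_exceeded}
    end.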


If you’re sold on HDM wallets and you want to build one, I hope you use HDMNode. I’ll be happy to take contributions from anyone who wants to make Bitcoin wallets more secure and trusted!
Author: "Yariv (noreply@blogger.com)"
Date: Tuesday, 03 Jun 2014 10:37
(This was originally posted at https://medium.com/@yarivs/bitcoins-money-supply-and-security-10cbf87ce39e.)

It’s evident that Bitcoin has been designed to reward hoarding by its early investors. It’s encoded in the protocol that the supply of new bitcoins will gradually diminish, halving every four years until the last bitcoin will be mined in 2140. In its first four years, each Bitcoin block rewarded the miner with 50 btc. Today, the reward is 25 btc; in 3-4 years, it will be 12.5 btc, and so on.

Furthermore, mining bitcoins used to be much more accessible. In the early days of the network, the hash rate (the global hashing power dedicated to mining bitcoins) was much lower. Since less hashing power was competing for the discovery of new bitcoins, it used to be possible to obtain a large number of new bitcoins by mining with commodity hardware. Today, however, it's difficult to mine profitably without custom-built, bleeding edge ASICs and cheap electricity. The combination of a low price, high block rewards, and low competition in mining allowed the earliest bitcoin buyers and miners to accumulate large stakes of the currency.

The scheme seems to have worked as planned. Bitcoin’s price rose from ~$14 in early 2013 to more than $1,200/btc near the end of last year (it’s now hovering around $700), yielding fantastic returns to those who accumulated a large stake of bitcoins early in the game.

The sharp rise in bitcoin’s price has led many people to call Bitcoin a bubble or Ponzi scheme. I believe that the Ponzi scheme accusation is misplaced because there’s no apparent intent to enrich early investors at the expense of late adopters or to cause late adopters any losses, which is characteristic of Ponzi schemes. In addition, while Bitcoin’s price has been undeniably volatile, whether it’s bubble or not remains to be seen. If people see Bitcoin as a reliable store of value, like gold, it’s quite possible its price will increase over time. Of course, it’s also conceivable that its price will plummet if, say, someone invents an alternative crypto currency that’s better than Bitcoin in every way and users adopt it in droves.

Whether Bitcoin’s skewed wealth distribution is fair or not is worth debating. I think that there are merits to both sides of the argument. Early investors should be rewarded for taking a risk, but if Bitcoin keeps appreciating it could be problematic that so much of its wealth should be concentrated in the hands of a few people. What is clear to me, however, is that if Bitcoin hadn’t rewarded hoarding by early investors, Bitcoin would have been much more vulnerable. In fact, it may not have been a viable new cryptocurrency at all.

The reason is that the hoarding behavior indirectly gives miners a much needed incentive to secure the network in its early days. Bitcoin can only be secure when a large amount of mining power is spent protecting the network from a 51% attack (an attack where a single entity controls 51% of the mining power and is then able to essentially rewrite the history of the blockchain and thereby launch double spend attacks). For a completely decentralized system designed for moving money, Bitcoin's security so far has been remarkably strong. It's unclear exactly how much it would cost to launch a 51% attack against Bitcoin, but I've heard estimates from a few hundred million to over a billion dollars. Regardless of the actual number, the aggregate amount of mining power dedicated to securing the network is very high, and gaining control over 51% of it to launch such an attack would be very expensive.

Miners aren’t volunteering their computers to the network altruistically. They have a dual incentive to mine: every time they mine a new block they earn new coins as well as transaction fees from anyone whose transaction is contained in the block. Five years into Bitcoin’s creation the transaction fees that miners make are still quite small of the amount miners earn in “block rewards” through minting new coins (only 0.29% according to https://blockchain.info/stats).

Earnings from any given block are determined by the total market cap for Bitcoin at the moment the block is mined, and the basic forces of supply and demand are what determine the market cap. The demand for bitcoins is driven by new investors as well as by users who want to acquire bitcoins in order to send them to other people or to buy things. The supply is driven by miners who use the newly minted coins to pay for their operations, as well as by investors who sell their coins.

If Bitcoin’s incentives were inverted and investors were encouraged to sell their coins rather than hoard them, the supply of bitcoins would increase and the price would drop. Miners would lose a proportional incentive to contribute to the network the computational power that’s needed to secure it.

When early investors hoard their coins they're propping up the price, which increases Bitcoin's market cap and attracts more mining power to protect the blockchain. This serves the function of priming the network's mining power in its early days, before enough transactions go through the network to provide large numbers of miners with sufficient transaction fees to continue mining profitably. Because the minting of new coins dilutes the ownership stake of early investors, early investors who hoard their bitcoins are arguably paying (indirectly) for Bitcoin's security ahead of its readiness to be used as a currency by large numbers of people. In fact, according to https://blockchain.info/stats, the current price per transaction (measured by dividing miner revenues by the number of transactions) is $34.68, which is paid almost exclusively through the block reward, i.e., by existing bitcoin holders.

The current price per transaction seems unsustainably high — and it is. To ultimately succeed, Bitcoin must see growing transaction volumes and a gradual inversion of the relationship between transaction fees and block rewards. If these trends don't materialize, it's likely that mining revenue will decline, causing miners to eventually drop out of the network. If a large portion of miners do so, the network's security will be at risk, causing investors to flee, the price to drop, block rewards to depreciate, more miners to leave, etc., in a downward spiral.

While this dystopian scenario is possible, it’s unlikely in the near term: Bitcoin is only five years old and, by design, miners will continue being rewarded with new bitcoins for over a century. Until then, Bitcoin has plenty of time to gain popularity as a transactional currency. This will require much infrastructure to be built, from exchanges to wallets to payment providers, but much of the work is already under way. It will also require greater acceptance by merchants and users.

Thinking about this makes me appreciate the cleverness of Bitcoin's design. On the surface, it offers a novel solution to the problem of distributed consensus (aka the Byzantine Generals' Problem). But beyond that, it's also a very carefully orchestrated economic system that would have failed very easily—and quickly—if its creator(s) hadn't so deliberately aligned the incentives of investors, miners and users, seemingly with an eye towards maximizing the network's security throughout its stages of adoption.

Such conceptual cleverness, however, cannot guarantee lasting success in the real world, where it remains to be seen whether Bitcoin can withstand market, regulatory, and competitive forces. It's conceivable that someone will invent an altcoin that's even better optimized for rewarding miners (for example, an altcoin with a perpetual inflation rate that's high enough to give miners significant additional revenue but low enough not to scare away investors). If this happens, and this altcoin one day surpasses Bitcoin in mining power, will Bitcoin remain relevant? It's hard to say, so grab some popcorn and enjoy the ride.

(Full disclosure — I own some bitcoins, which I bought in late 2013.)
Author: "Yariv (noreply@blogger.com)"
Date: Tuesday, 03 Jun 2014 10:37
(This was originally posted a few months ago at https://medium.com/on-banking/how-to-secure-your-bitcoins-29b86892fc64.)

In the past few months, I’ve spent a good amount of time investigating different solutions for Bitcoin storage. I’m writing this to share the knowledge I’ve gained and to help you make informed choices about securing your bitcoins. I won’t cover all of the products in this space — that would require a much longer post — just the ones that I think are relevant to the average user.

Before I go into the details I want to emphasize that great solutions to this problem don't exist. Every solution involves different security/usability tradeoffs. The more usable ones put your coins at greater risk of theft, and the more secure ones put them at greater risk of loss — it's indeed possible to store your coins so securely that you'll end up securing them from yourself. With this in mind, I'll walk you through the options that I think strike the right balance for most people. The only condition for my advice is that if you follow it and you end up losing your coins, you won't blame me for it!

If you have a small amount of coins, if you need them available online for day-to-day spending, or if you want a user friendly option, use Coinbase or Blockchain.info. They've both been around for a few years — an eternity in Bitcoin terms. They both have mobile apps, at least for Android, but iOS users are currently out of luck (sadly, Apple has removed all Bitcoin wallets from the App Store). They allow you to easily access your account from multiple machines. They provide two factor auth, an important requirement for online wallet security. Coinbase also has daily spend limits after which a second two factor auth check kicks in, which is a nice security feature.

The main difference from a security perspective between Coinbase and Blockchain.info is that Coinbase holds the private keys for your bitcoins on their servers whereas Blockchain.info encrypts them on the client with your password and only stores the encrypted keys on the server.

As a consequence, if you use Coinbase and Coinbase gets hacked or disappears tomorrow due to some calamity you will lose your coins.

This may sound alarming, but I believe the probability of this happening is low. Coinbase's team is competent, they follow strong security practices, and they're backed by some of the best VCs in the industry. However, the risk of loss does exist, so I wouldn't recommend putting a significant chunk of your life savings in Coinbase.

(Remember: no Bitcoin wallet has the equivalent of FDIC insurance like a bank account. Once the coins are gone, they’re gone.)

Using Blockchain.info protects you if they disappear or get hacked. If you keep a wallet backup, you can decrypt your private keys locally as described here. However, you can still lose your coins if your phone or computer gets hacked, or if an attacker gets his or her hands on your encrypted wallet and you've chosen a weak password, which would let the attacker brute force your private keys and steal your coins.

(When using Bitcoin wallets, always choose a secure password. It should be long, it should contain a combination of letters, numbers and symbols, and it should be unique. Also, never reuse passwords across services, because doing so could seriously compromise their security.)

If you do use Blockchain.info you should avoid using their web based wallet. Only use their native apps or browser extensions (preferably the Chrome app), which you can download at https://blockchain.info/wallet/browser-extension. This is because Javascript cryptography in the browser is inherently insecure. In fact, you should never trust any wallet that is web based.

Open source is another important consideration. Blockchain.info has open sourced all of their client wallet code, so its security can be vetted by experts. This is a baseline requirement for any wallet application that directly handles private keys.

Coinbase has also open sourced their Android app and their now-delisted iOS app, but this is less relevant because, as opposed to Blockchain.info, their client apps don't touch private keys. Nonetheless, Coinbase should be given credit for open sourcing their apps, as it gives security experts the ability to at least rule out some possible attacks.

This covers the options that I consider user friendly yet still decently secure. Let’s move on to the options that would give you much greater control over the security of your coins at the cost of much greater complexity. As you’ve probably guessed, this involves putting them in cold storage.

The two main contenders in this arena are Armory and Electrum. They both let you generate your private keys on an offline machine and only transfer your public keys to an online machine, where they can receive bitcoins but not send them. These clients are both deterministic, which means that all their private keys can be generated from a single seed — a randomly generated 128 bit number or easily remembered passphrase — which makes them easy to back up. Being deterministic also allows them to generate new receiving addresses from the seed's public key. This is an important feature because in most cases you don't want to reuse addresses when receiving bitcoins, in order to protect the privacy of your wallet balance on the public blockchain.

The main difference between Armory and Electrum is that Armory downloads the full blockchain (Armory uses bitcoind as its backend) and Electrum uses a third party server to only receive information about the addresses it holds. This makes Armory slow, private, and secure and Electrum fast, less private and somewhat less secure. Electrum is less private because the remote server to which your client connects knows your IP address and which addresses your wallet requested. It’s also less secure because a malicious remote server could lie to the client about its bitcoin balance. However, this weakness doesn’t compromise your private keys, which is the primary concern for offline wallets.

Of course, both products are open source, which is a baseline security requirement for client side wallets.

MultiBit is another notable option because, like Armory, it relies on the P2P network to query the state of the blockchain. MultiBit is also fast because it only queries the subset of the blockchain's blocks that are relevant to the addresses in the wallet (this is called SPV mode). This makes MultiBit more user friendly than Armory at the cost of some security, because an attacker who controls the internet connection could feed the client false information about the wallet's balance, similar to a malicious Electrum server.

MultiBit is also more private than Electrum because rather than querying the addresses it cares about directly, it queries them using a bloom filter that matches a superset of those addresses. However, I don't know precisely how much privacy this would give you, because presumably the remote nodes could infer which addresses the user owns within some confidence interval. Maybe someone who's an expert in SPV mode could expand on this.
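For intuition, here's a toy bloom filter in Erlang (not MultiBit's implementation; BIP37 defines the real wire format). Because membership tests can return false positives, a remote node sees only a superset of the wallet's addresses:

-module(bloom_toy).
-export([new/1, add/2, member/2]).

%% Represent the filter as a set of occupied positions in a table of
%% NumBits slots; a real filter would use an actual bit array.
new(NumBits) -> {NumBits, sets:new()}.

%% Hash each item to three positions using different seeds.
positions(NumBits, Item) ->
    [erlang:phash2({Seed, Item}, NumBits) || Seed <- [1, 2, 3]].

add({NumBits, Bits}, Item) ->
    {NumBits, lists:foldl(fun sets:add_element/2, Bits, positions(NumBits, Item))}.

%% True for every item that was added, and occasionally for items that
%% weren't -- those false positives are what provide the privacy.
member({NumBits, Bits}, Item) ->
    lists:all(fun(P) -> sets:is_element(P, Bits) end, positions(NumBits, Item)).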

I don’t recommend MultiBit at the moment because it’s not deterministic, which makes it harder to back up. The developers announced that a future release will enable deterministic wallets, at which point MultiBit will become a strong contender for secure yet usable offline bitcoin storage.

The most important thing you have to remember before you embark on this cold storage security journey (and it is a serious journey, as you’ll soon see), is that if your private keys ever touch a machine that’s connected to the internet you should assume they’re compromised and that your coins will be stolen.

This is because, if you’re truly paranoid, you should know that computers fundamentally cannot be trusted. It’s impossible to know what really goes on within a computer. It could have viruses, rootkits, software vulnerabilities, and even compromised hardware. If that computer has access to your private keys and it can send them over the internet, whoever controls your computer can steal your bitcoins.

With this warning in mind, let’s walk through the steps you should take to set up secure cold storage. I’m going to describe the Electrum method because I’m more familiar with it but Armory should be similar.

1) Get an old computer you won't use for anything else. Reformat it and install Linux on it. I recommend either Debian or Ubuntu. Make sure this computer never connects to the internet.

(When you format this machine, you should ideally encrypt your partitions — including your swap partition — for extra security. Otherwise, your OS could inadvertently write your private keys unencrypted to disk while it swaps them out of memory, allowing an attacker who gets his hands on the machine to steal your private keys.)

2) Download Electrum onto your online machine and copy it to your offline machine from a USB drive.

(Note that even USB drives can carry viruses, so it’s recommended to use a new USB drive that hasn’t touched any other machines. However, those viruses are unlikely to infect Linux machines, so if you followed step 1 you should be fairly safe).

3) Verify the binary’s GPG signature before you install it (these MultiBit instructions should apply to Electrum users too). Follow the installation instructions to install Electrum on both machines.

4) On the offline machine, create a new wallet. Choose a strong password for encrypting this wallet on disk. Write the wallet’s seed on a piece of paper.

5) Copy the public key from the offline machine to the Electrum client running on your online machine. At this point, Electrum on the online machine should be able to generate public keys for receiving bitcoins but won't be able to spend them, because it doesn't have access to the private keys, which are safely stored exclusively on the offline machine. (This process is described in more detail here.)

6) As a test, send a small amount of coins to the first few addresses generated by the wallet. Then create a new wallet on the online machine, enter the seed from the offline machine, and verify your wallet has been completely restored and that you can spend the funds. Send the coins back to the online wallet from which you sent them to verify that Electrum can send them successfully.

7) If everything worked as expected, repeat steps 4-5. You should repeat those steps because the moment you entered your seed into the online machine you could have compromised it, so you shouldn't use it anymore to protect your offline keys.

(If you’re truly paranoid you should should use something like Diceware http://world.std.com/~reinhold/diceware.html because there’s a chance your computer’s random number generator isn’t going to generate sufficent entropy. It sounds crazy, but this kind of bug has happened before with certain Android wallets.)

8) Send your remaining coins to the new wallet you created. Store the seed on paper in a secure place, or in multiple secure places as you see fit (e.g. in safes, bank vaults, or wherever else you feel safe).

9) For even stronger security of your paper backups, you should consider generating two factor paper backups for your wallet's seeds by encrypting them with your password, as described in BIP38. This would prevent anyone who gets your paper backups from stealing your coins if they don't also know your password. The process for doing this is left as an exercise for the reader.
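To illustrate the idea only (real BIP38 uses scrypt and a specific serialization format; this toy Erlang sketch is not compatible with it):

%% Encrypt a wallet seed under a password-derived key so that a stolen
%% paper backup is useless without the password. Toy only: a real scheme
%% derives the key with scrypt, not a bare SHA-256 of the password.
encrypt_seed(Seed, Password) ->
    <<Key:16/binary, _/binary>> = crypto:hash(sha256, Password),
    IV = crypto:strong_rand_bytes(16),
    {IV, crypto:crypto_one_time(aes_128_cfb128, Key, IV, Seed, true)}.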

If you’ve read this far, this is a good time to stop and ask yourself: are you sure you still want to own bitcoins? ☺

As you can see, Bitcoin security is quite complicated and hard to get right, even for people that have the understanding and the patience to put their coins in offline storage. In my opinion, this is one of the main reasons that Bitcoin isn’t quite ready for the mainstream.

That said, the future for Bitcoin security looks bright. New wallets that use multisig addresses to protect bitcoins behind multiple private keys are coming, and once they’re vetted by the experts I’ll update this post or write a new post with my latest recommendations. I’ve also started working on a project that I hope will improve security for Bitcoin users but I‘m not ready to make any promises or announcements about it. ☺

Got feedback? Join the conversation or follow me on Twitter.
Author: "Yariv (noreply@blogger.com)"
Date: Monday, 02 Jun 2014 19:01
In my last post I described how to use LFE to overcome some of the weaknesses of parameterized modules. Unfortunately, all is not rosy yet in the land of LFE types. Parameterized modules only let you create static types: the compiler doesn't do static type checking, but you do have to define the properties of your types at compile time. This works in many cases, but sometimes you want totally dynamic containers that map keys to values. In Erlang, this is typically done with dicts. We could still use them with LFE, but I don't like having different methods of accessing the properties of objects depending on whether their types were defined at run time or compile time.

Let's use macros to solve the problem.

In my last post, I relied on the built-in 'call' function to access the properties of static objects. Let's create a wrapper to 'call' that lets us access the properties of dicts in exactly the same manner as we do properties of other objects:
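A sketch of such a wrapper (one plausible shape, not the only one): dicts are tuples tagged with the atom 'dict', so we can test for them at runtime and fall back to 'call' for static objects. Two arguments means get, three means set.

(defmacro dot
  ;; getter: (dot obj prop)
  ((list obj prop)
   `(if (: erlang is_record ,obj 'dict)
      (: dict fetch ,prop ,obj)
      (call ,obj ,prop)))
  ;; setter: (dot obj prop val)
  ((list obj prop val)
   `(if (: erlang is_record ,obj 'dict)
      (: dict store ,prop ,val ,obj)
      (call ,obj ,prop ,val))))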

We can use 'dot' to get and set the properties of both dicts and static objects:
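For instance, a hypothetical 'dog' module whose test function builds a dict and reads the properties back through 'dot' (matching the session below):

(defmodule dog
  (export (test 0)))

(defun test ()
  (let* ((d1 (dot (: dict new) 'name 'lola))
         (d2 (dot d1 'friend 'rocky)))
    (list (dot d2 'name) (dot d2 'friend))))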



> (: dog test)
(lola rocky)


I think this is kind of cool, though to be honest I'm not entirely sure it's a great idea to obfuscate in the code whether we're dealing with dicts or static objects.
Author: "Yariv (noreply@blogger.com)"
Date: Thursday, 20 Jun 2013 17:01
Mortgages can be quite confusing and comparing them can be difficult. That's why I created the Mortgage Hacking Calculator at http://www.mortgagehacking.com. Please check it out and let me know of any feedback you may have. I hope you find it useful!
Author: "Yariv (noreply@blogger.com)"
PicLike
Date: Tuesday, 19 Oct 2010 23:32
Check out PicLike, the new app I made using the Flickr API, Google App Engine and the Facebook Like button: http://piclike.appspot.com.
Author: "Yariv (noreply@blogger.com)"
Date: Tuesday, 19 Oct 2010 23:29
Erlang is *almost* at a tipping point. Thanks to reddit, many people are interested in it. However, it's not there yet. Despite being the only language that got concurrency right, and all its other standout features, many developers still use other languages. The only explanation I can think of is that Erlang hasn't had much PR over the years (the Erlang movie notwithstanding). I'm confident that a good PR boost will help push Erlang over the hump. Unfortunately, Ericsson doesn't seem interested in heavily promoting Erlang the way Microsoft promotes .NET and Sun promotes Java (this may be because many Ericsson employees have never heard of Erlang). So, I decided to take things into my own hands. I don't have a budget, so I need to get creative. The best way to succeed if you're small is to ride a big wave -- and what's a better wave to ride than Ruby on Rails?

Ruby on Rails is very popular -- much more than ErlyWeb. I believe this popularity is due to the "on Rails" meme, which is just bursting with positive connotations. It sounds young, fresh, happy. It's the anti-enterprisy software. It emancipates you from burdensome type systems, explicit getters and setters, and (ugh) XML. Its metaprogramming wizardry is made of bliss. It evokes images of riding in environmentally-friendly transportation, looking out the window at grassy meadows, rolling hills and sunny skies.

I think that renaming ErlyWeb to "Erlang on Rails" will help win over the hearts and minds of many programmers who are currently on the fence. They may be curious about Erlang but are turned off by its telecom image. "Erlang on Rails" conveys a more balanced feeling of industrial strength applications from the telecom world mixed with the social Web 2.0 era of interconnectedness that celebrates the rise of individualism over grey corporate culture.

2008 will be the year of Erlang on Rails. I know it.

Update: This was an April Fool's joke, in case it's not obvious anymore :)
Author: "Yariv (noreply@blogger.com)" Tags: "erlyweb"
Date: Tuesday, 19 Oct 2010 23:29
Nick Gerakines, the author of Facebook Application Development, created with ErlyWeb the very cool Facebook app I Play WoW. I Play WoW bridges real people and the characters they play on World of Warcraft. Nick told me he got a lot of feedback such as "Wow! I didn't know my brother-in-law is in my guild!" and "It's been 5 years since I talked to some of them, but a bunch of my friends from school play on these realms and I didn't even know they played".

Some facts:
- 53.5k installs
- 2.9k daily active users
- 1.3k application fans
- 200+ new users a day on average
- In the past 30 days it's gotten over 2 million page views, and users spend more than 5 minutes on average on the application
- Erlang application layout:
* Charstore w/ Mnesia: acts as the raw character store and cache for interactions with wowarmory.com.
* I Play WoW w/ ErlyWeb + Mnesia: the front-end and UI for the application. The majority of the FB API calls are made here or are spawned from here.
* There is still one component in Perl that has yet to be ported over, mainly due to not having enough time. It's on the list of things to do.

(My note: it sounds like Nick is also using spawned processes to make FB API calls asynchronously. It's a great technique for reducing page load time and avoiding the annoying timeouts Facebook imposes on page renderings.)

If you play World of Warcraft (an addiction I've luckily been able to avoid thus far :) ) and you are on Facebook, give I Play WoW a try. You may discover that your boss is a level 10 ogre or something :)

Congrats, Nick, for creating such a successful app with ErlyWeb!
Author: "Yariv (noreply@blogger.com)" Tags: "erlyweb"
Date: Tuesday, 19 Oct 2010 23:29
Damien Katz's latest blog post lists some ways in which he thinks Erlang sucks. I agree with some of these points but not with all of them. Below are my responses to some of his complaints:

1. Basic Syntax

I've heard many people express their dislike for the Erlang syntax. I found the syntax a bit weird when I started using it, but once I got used to it it hasn't bothered me much. Sometimes I mess up and use the wrong expression terminator, and sometimes things break when I cut and paste code between function clauses, but it hasn't been a real pain point for me. I understand where the complaints are coming from, but IMHO it's a minor issue.

Since the release of LFE last week, if you don't like the Erlang syntax, you can write Erlang code using Lisp syntax, with full support for Lisp macros. If you prefer Lisp syntax to Erlang syntax, you have a choice.

2. 'if' expressions

The first issue is that in an 'if' expression, the condition has to match one of the clauses, or an exception is thrown. This means you can't write simple code like


if Logging -> log("something") end


and instead you have to write


if Logging -> log("something"); true -> ok end


This requirement may seem annoying, but it is there for a good reason. In Erlang, all expressions (except for 'exit()') must return a value. You should always be able to write

A = foo()

and expect A to be bound to a value. There is no "null" value in Erlang (the 'undefined' atom usually takes its place).

Fortunately, Erlang lets you get around this issue with a one-line macro:


-define(my_if(Predicate, Expression), if Predicate -> Expression; true -> undefined end).


Then you can use it as follows:


?my_if(Logging, log("something"))


It's not that bad, is it?

This solution does have a shortcoming, though, which is that it only works for a single-clause 'if' expression. If it has multiple clauses, you're back where you started. That's where you should take a second look at LFE :)

The second issue about 'if' expressions is that you can't call any user-defined function in the conditional, just a subset of the Erlang BIFs. You can get around this limitation by using 'case', but again you have to provide a 'catch all' clause. For a single clause, you can simply change the macro I created to use a case statement.


-define(my_case(Predicate, Expression), case Predicate of true -> Expression; _ -> undefined end).


For an arbitrary number of clauses, a macro won't help, and this is something you'll just have to live with. If it really bothers you, use LFE.

3. Strings

The perennial complaint against Erlang is that it "sucks" for strings. Erlang represents strings as lists of integers, and for some reason many people are convinced that this is tantamount to suckage.


...you can't distinguish easily at runtime between a string and a list, and especially between a string and a list of integers.


A string *is* a list of integers -- why should we not represent it as such? If you care about the type of list you're dealing with, you should embed it in a tuple with a type description, e.g.


{string, "dog"},
{instruments, [guitar, bass, drums]}


But if you don't care what the type is, representing a string as a list makes a lot of sense because it lets you leverage all the tools Erlang has for working with lists.

A real issue with using lists is that it's a memory-hungry data structure, especially on 64 bit machines, where you need 128 bits = 16 bytes to store each character. If your application processes such massive amounts of string data that this becomes a bottleneck, you can always use binaries. In fact, you should always use binaries for "static" strings on which you don't need to do character-level manipulation in code. ErlTL, for example, compiles all static template data as binaries to save memory.
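You can check the difference yourself; erts_debug:size/1 returns a term's heap footprint in words (8 bytes each on a 64-bit VM):

%% A list string costs two words (one cons cell) per character, while a
%% binary stores the characters as plain bytes plus a small header.
string_sizes() ->
    {erts_debug:size("hello erlang"), erts_debug:size(<<"hello erlang">>)}.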


Erlang string operations are just not as simple or easy as most languages with integrated string types. I personally wouldn't pick Erlang for most front-end web application work. I'd probably choose PHP or Python, or some other scripting language with integrated string handling.


I disagree with this statement, but instead of rebutting it directly, I'll suggest a new kind of Erlang challenge: build a webapp in Erlang and show me how string handling was a problem for you. I've heard a number of people say that Erlang's string handling is a hindrance in building webapps, but in my experience this simply isn't true. If you ran into real problems with strings when building a webapp, I would be very interested in hearing about them, but otherwise it's a waste of time hypothesizing about problems that don't exist.

4. Functional Programming Mismatch

The issue here is that Erlang's variable immutability supposedly makes writing test code difficult.


Immutable variables in Erlang are hard to deal with when you have code that tends to change a lot, like user application code, where you are often performing a bunch of arbitrary steps that need to be changed as needs evolve.

In C, let's say you have some code:

int f(int x) {
    x = foo(x);
    x = bar(x);
    return baz(x);
}

And you want to add a new step in the function:

int f(int x) {
    x = foo(x);
    x = fab(x);
    x = bar(x);
    return baz(x);
}

Only one line needs editing.

Consider the Erlang equivalent:

f(X) ->
    X1 = foo(X),
    X2 = bar(X1),
    baz(X2).

Now you want to add a new step, which requires editing every variable thereafter:

f(X) ->
    X1 = foo(X),
    X2 = fab(X1),
    X3 = bar(X2),
    baz(X3).



This is an issue that I ran into in a couple of places, and I agree that it can be annoying. However, discussing this consequence of immutability without mentioning its benefits is missing a big part of the picture. I really think that immutability is one of Erlang's best traits. Immutability makes code much more readable and easy to debug. For a trivial example, consider this Javascript code:


function test() {
    var a = {foo: 1, bar: 2};
    baz(a);
    return a.foo;
}


What does the function return? We have no idea. To answer this question, we have to read the code for baz() and recursively descend into all the functions that baz() calls with 'a' as a parameter. Even running the code doesn't help because it's possible that baz() only modifies 'a' based on some unpredictable event such as some user input.

Consider the Erlang version:


test() ->
    A = [{foo, 1}, {bar, 2}],
    baz(A),
    proplists:get_value(foo, A).


Because of variable immutability, we know that this function returns '1'.

I think that the guarantee that a variable's value will never change after it's bound is a great language feature, and it far outweighs the disadvantage of having to use unique variable names in functions that do a series of modifications to some data.

If you're writing code like in Damien's example and you want to be able to insert lines without changing a bunch of variable names, I have a tip: increment by 10. This will prevent the big cascading variable renamings in most situations. Instead of the original code, write


f(X) ->
    X10 = foo(X),
    X20 = bar(X10),
    baz(X20).


then change it as follows when inserting a new line in the middle:


f(X) ->
    X10 = foo(X),
    X15 = fab(X10),
    X20 = bar(X15),
    baz(X20).


Yes, I know, it's not exactly beautiful, but in the rare cases where you need it, it's a useful trick.

This issue could be rephrased as a complaint against imperative languages: "I don't know if the function to which I pass my variable will change it! It's too hard to track down all the places in the code where my data could get mangled!" This may sound outlandish especially if you haven't coded in Erlang or Haskell, but that's how I really feel sometimes when I go back from Erlang to an imperative language.


Erlang wasn't a good match for tests and for the same reasons I don't think it's a good match for front-end web applications.


I don't understand this argument. Webapps need to be tested just like any other application. I don't see where the distinction lies.

5. Records

Many people hate records and on this topic I fully agree with Damien. I think the OTP team should just integrate Recless into the language and thereby solve most of the issues people have with records.

If you really hate records, another option is to use LFE, which automatically generates getters and setters for record properties.

Incidentally, if you use ErlyWeb with ErlyDB, you probably won't use records at all and you won't run into these annoyances. ErlyDB generates functions for accessing object properties which is much nicer than using the record syntax. ErlyDB also lets you access properties dynamically, which records don't allow, e.g.


P = person:new_with([{name, "Paul"}]),
Fields = person:db_field_names(),
[io:format("~p: ~p~n", [Field, person:Field(P)]) || Field <- Fields]


Records are ugly, but if you're creating an ErlyWeb app, you probably won't need them. If they do cause you a great deal of pain, you can go ahead and help me finish Recless and then bug the OTP team to integrate it into the language :)

6. Code organization


Every time you need to create something resembling a class (like an OTP generic process), you have to create a whole Erlang module, which means a whole new source file with a copyright banner plus the Erlang cruft at the top of each source file, and then it must be added to the build system and source control. The extra file creation artificially spreads out the code over the file system, making things harder to follow.


I think this issue occurs in many programming languages, and I don't think Erlang is the biggest offender here. Unlike Java, for instance, Erlang doesn't restrict you to defining a single data type per module. And Ruby (especially Rails) applications are also known for having multitudes of small files. In Erlang, you indeed have to create a module per gen-server and the other 'behaviors' but depending on the application this may not be an issue. However, I don't think there's anything wrong with keeping different gen-servers in different modules. It should make the code more organized, not less.

7. Uneven Libraries and Documentation

I haven't had a problem with most libraries, and in cases where they do have big shortcomings you can often find a good 3rd party tool. The documentation is a pain to browse and search, but gotapi.com makes some of this pain go away.


Summary

Is Erlang perfect? Certainly not. But sometimes people exaggerate Erlang's problems or they don't address the full picture.

Here are some suggestions I have for improving Erlang:

- Add a Recless-like functionality to make working with records less painful.
- Improve the online documentation by making it easier to browse and search.
- Make some of the string APIs (especially the regexp library) also work with binaries and iolists.
- Add support for overloading macros, just like functions.
- Add support for Smerl-style function inheritance between modules.

Like any language, Erlang has some warts. But if it were perfect, it would be boring, wouldn't it? :)
Author: "Yariv (noreply@blogger.com)" Tags: "erlang, programming"
Date: Tuesday, 19 Oct 2010 23:29
With the recent brouhaha over Twitter's scalability problems, I thought, wouldn't it be fun to write a Twitter clone in Erlang?

Last weekend was cold and rainy here in Palo Alto, so I sat down and hacked one, and thus Twoorl was born. It took me one full day plus a couple of evenings. The codebase is about 1700 lines (including comments). You can get it at http://code.google.com/p/twoorl

[Screenshot: twoorl_screenshot.png]

Note: you need the trunk version of ErlyWeb to make it work (when released, it will be the 0.7.1 version).

Many people have written about Twitter's scalability problems and how to solve them. Some have blamed Rails (TechCrunch is among them), whereas others, including Blaine Cook, Twitter's architect, have convincingly argued that you can scale a webapp written in any language/framework if you've figured out how to Just Add More Servers to handle the growing traffic. Eran Hammer-Lahav wrote some of the most insightful articles on the subject in On Scaling a Microblogging Service.

I have no idea why Twitter is having a hard time scaling. Well, I have some suspicions, but since I haven't been in the Twitter trenches, such speculation isn't worth wasting many pixels on.

I didn't write a Twitter clone in Erlang because I thought my implementation would be inherently more scalable than a Rails one (although it may be cheaper to scale because Erlang has very good performance). In fact, Twoorl right now wouldn't scale well at all, since I prioritized simplicity above all else.

The reasons I wrote Twoorl are:

- ErlyWeb needs more open source apps showing how to use the framework. It's hard to pick up how to use the framework just from the API docs.
- Twitter is awesome. Once you start using it, it becomes addictive. I thought it would be fun to write my own.
- Twitter is very popular, but I don't know of any open source clones. I figured somebody may actually want one!
- Some people think Erlang isn't a good language for building webapps. I like to prove them wrong :)
- Although you can scale pretty much anything, your choice of language can make a difference in performance and stability, both of which lead to happy users.
- I think Erlang is a great language for writing a Twitter clone because Twitter's functionality offers interesting opportunities to benefit from concurrency. Here are a couple of ideas I thought of:

1) If you use sharding, the tweets for different users would be stored in separate databases. When you render the page for someone's timeline, wouldn't it be advantageous to fetch the tweets for all the users she follows in parallel? In Ruby, you would probably do something like this:


def get_tweets(users)
  alltweets = []
  users.each do |user|
    alltweets.concat(user.fetch_tweets)
  end
  alltweets.sort
end


(Please forgive any language errors -- my Ruby is very rusty. Treat the above as pseudocode.)

This code would work well enough for a small number of tweet streams, but as the number gets large, it would take a very long time to execute.

In ErlyWeb, you could instead do the following:


get_tweets(Users) ->
    lists:sort(lists:flatten(pmap(fun(Usr) -> Usr:tweets() end, Users))).


This would spawn a process for each user she follows, fetch that user's tweets, and then reassemble them in sorted order in the original process before rendering the page. (Think of it as map/reduce implemented directly in the application controller.) If a user follows hundreds of other users, querying their tweets in parallel can significantly reduce page rendering time.
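pmap isn't in the standard library, but a minimal version is only a few lines (a sketch; production code would add timeouts and error handling):

%% Spawn a process per element, then collect the results in the
%% original order by matching on unique references.
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, F(Elem)} end),
                Ref
            end || Elem <- List],
    [receive {Ref, Result} -> Result end || Ref <- Refs].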

2) Background tasks. When a user sends a tweet, the first thing you want to do is store it in the database. Then, depending on the features, you have to do a bunch of other stuff: send IM/SMS notifications, update RSS feeds, expire caches, etc. Why not do those tasks in different background processes? After the write to the DB, you can return an immediate reply to the user, giving him or her the perception of speed, and then let the background processes do all the extra work for processing the tweet.

(This technique works very well for Facebook apps, by the way. In Vimagi, when the user submits a painting, the app first saves the painting data, and then it spawns a new process to update the news feed and profile box, send notifications, etc.)
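In Erlang, the pattern is just a couple of spawn calls (a sketch; the helper functions are made-up names, not Twoorl's actual code):

%% Store the tweet synchronously, then hand the slow work to background
%% processes so the user gets an immediate response.
handle_new_tweet(User, Text) ->
    ok = store_tweet(User, Text),
    spawn(fun() -> send_notifications(User, Text) end),
    spawn(fun() -> update_feeds(User, Text) end),
    ok.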

Anyway, I hope you enjoy Twoorl. It's still in very early alpha. It doesn't have many features and it probably has bugs. Please take Twoorl for a spin and give me your feedback! I'll also appreciate useful contributions :)
Author: "Yariv (noreply@blogger.com)" Tags: "erlang, programming, twoorl, erlyweb"
Date: Tuesday, 19 Oct 2010 23:29
Nick Gerakines wrote a good tutorial on how to make tag clouds in ErlyWeb. Check it out at http://blog.socklabs.com/2008/04/tag_clouds_in_erlang_with_erly/.
Author: "Yariv (noreply@blogger.com)" Tags: "erlyweb"
Date: Tuesday, 19 Oct 2010 23:29
With elections around the corner and politics everywhere these days, I couldn't resist throwing some Erlang into the mix. Don't worry, I'm not about to start discussing my political opinions on this blog. You can find plenty of political commentary on other sites. Instead, I decided to create a sub-Vimagi for political cartoons: Presidential Vimagi.

On Presidential Vimagi, you don't write or talk about your political views. You paint them. If a picture is worth a thousand words, a painting is worth, well, you do the math.

To give you inspiration, I assembled an all star cast of American presidents, vice presidents and presidential candidates: George Bush, Dick Cheney, Barack Obama, Hillary Clinton, Mike Huckabee, John McCain, Ron Paul, Bill Clinton and Al Gore. They each have a profile and you can paint on their profile's v.boards. You can also become friends with them if you want to show your support.

Presidential Vimagi is still a bit rough around the edges, so please let me know if you find any bugs or issues. Feature suggestions are also welcome, of course.

Update: I decided to take the site down for now. It's too prone to vandalism and I don't have the time to constantly monitor it to remove the offensive stuff. It was a fun experiment while it lasted. Oh well :)
Author: "Yariv (noreply@blogger.com)" Tags: "vimagi, presidential"
Date: Tuesday, 19 Oct 2010 23:29
I just hacked together an (alpha) implementation of the FriendFeed API for Erlang. You can get the code at http://code.google.com/p/erlang-friendfeed/. Enjoy!

By the way, my shiny new FriendFeed is at http://friendfeed.com/yariv.
Author: "Yariv (noreply@blogger.com)" Tags: "erlang, friendfeed"
Date: Tuesday, 19 Oct 2010 23:29
Erlang has a Lisp syntax! Robert Virding just released LFE, his Lisp syntax compiler for Erlang. Finally, Erlang hackers can enjoy the full power of Lisp-style macros. I suspect it won't be long before you'll be able to hack ErlyWeb apps in LFE :)

While you're here, please answer this Wufoo poll:


Author: "Yariv (noreply@blogger.com)" Tags: "lisp, erlang, internet"
Date: Tuesday, 19 Oct 2010 23:29
I attended the Bay Area Erlang Factory last week. It was a great event. I met many Erlang hackers, attended interesting talks, learned about cool projects (CouchDB, QuickCheck, Nitrogen, Facebook Chat), gave a talk about ErlyWeb, and drank beer (without beer, it wouldn't be a true Erlang meetup).

My favorite talk was by Damien Katz. He told the story of how he had decided to take a risk, quit his job, and work on his then amorphous project. He wanted to work on cool stuff, and that was the only way he could do it. Even if nothing else came out of it, he knew it would have been a great learning exercise. Something great did eventually come out of it, as he created CouchDB (which looks awesome btw) and IBM eventually hired him to work on it full time.

Damien's story reminded me of the time I started working on ErlyWeb a few years ago. After I left the company I was working for at the time, I decided to take a few months and work on something cool. I didn't know what exactly it would be or how long it would take, but I knew that I wanted to build a product that would help people communicate in new ways, and I wanted to build it with my favorite tools. I knew the chance of failure was high, but I figured the learning alone would be worth it. I also viewed open source as an insurance policy of sorts. Even if I couldn't get a product off the ground, my code could live on and continue to provide value to people.

Doing it paid off. My savings dwindled, but I learned Erlang, created ErlyWeb and Vimagi, met many like minded people, and it opened new doors. Now I work on cool stuff at Facebook, ErlyWeb lives on, and every day people are using Vimagi to create amazing art and share it with their friends.

The moral of the story: if you're not working on cool stuff, take a risk and try to make it happen. Don't worry about building the next Google or making lots of money, because you'll probably fail. But the lessons you learn and the connections you make will be worth it.
Author: "Yariv (noreply@blogger.com)"
Date: Tuesday, 19 Oct 2010 23:29
You may have noticed I put SnapTalent (http://snaptalent) ads on my blog. This is the first time I've put ads here. I've avoided them because ads from most ad networks are usually irritating/ugly, but the SnapTalent ones are nice looking, unobtrusive, and relevant to my blog's readers because they advertise hacker jobs at startups. I can't say those ads have generated substantial revenue for me thus far, but at least I've made enough money to buy the SnapTalent guys a round of beers :)
Author: "Yariv (noreply@blogger.com)"
Date: Tuesday, 19 Oct 2010 23:29
I just caught Amazon's announcement of the new persistent storage engine for EC2. This is great stuff. It lets you create persistent block level storage devices ranging from 1GB to 1TB in size and attach them to EC2 instances in predetermined availability zones. This service complements Amazon's other storage services -- S3 and SimpleDB -- by providing raw block-level storage devices that are persistent, fast and local (so you don't have to worry about SimpleDB's eventual consistency issues). You can use these volumes for anything -- running a traditional DBMS (MySQL, Postgres) is the first thing that comes to mind.

This announcement is a departure from Amazon's tradition of announcing services only once they become available. It looks like Amazon is feeling the heat of competition from Google App Engine and is becoming more open in order to win over the hearts and minds of developers who are drawn to GAE for its auto-magical scalability. The ability to attach multiple terabyte-sized volumes on demand alleviates some of those concerns when deploying on Amazon's infrastructure. I'm sure it won't be long before someone creates an open source BigTable-like solution for applications that need massive scalability and redundancy on top of multiple persistent storage volumes (I think this would be a great application to write in Erlang, but I don't know how well Erlang performs in applications that require heavy disk IO).

I like what Amazon is doing. By providing the basic building blocks for scalable applications, it enables startups to create their own GAE competitors (Heroku is the first one that comes to mind) on top of Amazon's infrastructure. Smart move.

Google has the advantage of being able to provide APIs for tight integration with other Google services such as authentication and search (the latter is hypothetical as of now). We'll see how strongly this plays in Google's favor in the coming months.

Of course, price is still a question mark. Neither Amazon's persistent storage service nor GAE has had its pricing announced.

Another missing detail is the storage service's reliability. If a disc fails, do you lose your data? What's the failure probability? Etc.

All this is great for developers. Competition between Amazon and Google means developers will enjoy more services at lower prices in the coming years.
Author: "Yariv (noreply@blogger.com)"
Date: Tuesday, 19 Oct 2010 23:29
This is a good one I fished from the mailing list:


Using Erlang continually makes me both smile and cry at the same time. I smile because of the overall simplicity it brings to solving all those hard issues I mentioned above, but I also cry knowing how many hours, days, weeks, and months my former colleagues and I spent trying to solve all those really hard issues.


http://www.nabble.com/Steve-Vinosky-interview-to15709698.html#a15720750
Author: "Yariv (noreply@blogger.com)" Tags: "erlang, programming"
Date: Tuesday, 19 Oct 2010 23:29
In my time-wasting activities on geeky social news sites, I've been seeing more and more articles about Scala. The main reasons I became interested in Scala are 1) Scala is an OO/FP hybrid, and I think that any attempt to introduce more FP concepts into the OO world is a good thing, and 2) Scala's Actors library is heavily influenced by Erlang, and Scala is sometimes mentioned in the same context as Erlang as a great language for building scalable concurrent applications.

A few times, I've seen the following take on the relative merits of Scala and Erlang: Erlang is great for concurrent programming and it has a great track record in its niche, but it's unlikely to become mainstream because it's foreign and it doesn't have as many libraries as Java. Scala, on the other hand, has the best of both worlds: it has functional semantics, its Actors library provides Erlang-style concurrency, it runs on the JVM, and it has access to all the Java libraries. This combination makes Scala a better choice for building concurrent applications, especially for companies that are invested in Java.

I haven't coded in Scala, but I did a good amount of research on it and it looks like a great language. Some of the best programmers I know rave about it. I think that Scala can be a great replacement for Java. Function objects, type inference, mixins and pattern matching are all great language features that Scala has and that are sorely missing from Java.

Although I believe Scala is a great language that is clearly superior to Java, Scala doesn't supersede Erlang as my language of choice for building high-availability, low-latency, massively concurrent applications. Scala's Actors library is a big improvement over what Java has to offer in terms of concurrency, but it doesn't provide all the benefits of Erlang-style concurrency that make Erlang such a great tool for the job. I did a good amount of research into the matter, and these are the important differences I think one should consider when choosing between Scala and Erlang. (If I missed something or got something wrong, please let me know. I don't profess to be a Scala expert by any means.)

Concurrent programming


Scala's Actors library does a good job of emulating Erlang-style message passing. Similar to Erlang processes, Scala actors send and receive messages through mailboxes. Like Erlang, Scala has pattern matching semantics for receiving messages, which results in elegant, concise code (although I think Erlang's simpler type system makes pattern matching easier in Erlang).
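To make this concrete, here's a minimal sketch of the Erlang side of what Scala actors emulate (the function and message names are mine, not from any particular library):

    %% A counter process: pattern matching in 'receive' dispatches on
    %% message shape; messages that match no clause stay in the mailbox.
    loop(Count) ->
        receive
            {incr, N}   -> loop(Count + N);
            {get, From} -> From ! {count, Count}, loop(Count);
            stop        -> ok
        end.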

Scala's Actors library goes pretty far, but it doesn't (well, it can't) provide an important feature that makes concurrent programming so easy in Erlang: immutability. In Erlang, multiple processes can share the same data within the same VM, and the language guarantees that race conditions won't happen because this data is immutable. In Scala, though, you can send pointers to mutable objects between actors. This is the classic recipe for race conditions, and it leaves you just where you started: having to ensure synchronized access to shared memory.

If you're careful, you may be able to avoid this problem by copying all messages or by treating all sent objects as immutable, but the Scala language doesn't guarantee safe access to shared objects. Erlang does.

Hot code swapping


Hot code swapping is a killer feature. Not only does it (mostly) eliminate the downtime required to do code upgrades, it also makes a language much more productive because it allows for true interactive programming. With hot code swapping, you can immediately test the effects of code changes without stopping your server, recompiling your code, restarting your server (and losing the application's state), and navigating back to where you had been before the code change. Hot code swapping is one of the main reasons I like coding in Erlang.

The JVM has limited support for hot code swapping during development -- I believe it only lets you change a method's body at runtime (an improvement to this feature is among Sun's top 25 RFEs for Java). This capability is not as robust as Erlang's hot code swapping, which works for any code modification at any time.

A great aspect of Erlang's hot code swapping is that when you load new code, the VM keeps around the previous version of the code. This gives running processes an opportunity to receive a message to perform a code swap before the old version of the code is finally removed (which kills processes that didn't perform a code upgrade). This feature is unique to Erlang as far as I know.
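Here's a sketch of the classic upgrade idiom (handle/2 is an assumed helper, not a real API). A fully qualified call like ?MODULE:loop(State) always jumps to the newest loaded version of the module, while a plain local call keeps executing the version that's currently running:

    loop(State) ->
        receive
            upgrade ->
                ?MODULE:loop(State);         %% fully qualified: re-enter via the new code
            Msg ->
                loop(handle(Msg, State))     %% local call: stays in the current version
        end.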

Hot code swapping is even more important for real-time applications that enable synchronous communications between users. Restarting such servers would cause user sessions to disconnect, which would lead to poor user experience. Imagine playing World of Warcraft and, in the middle of a major battle, losing your connection because the developers wanted to add a log line somewhere in the code. It would be pretty upsetting.

Garbage collection


A common argument against GC'd languages is that they are unsuitable for low latency applications due to potential long GC sweeps that freeze the VM. Modern GC optimizations such as generational collection alleviate the problem somewhat, but not entirely. Occasionally, the old generation needs to be collected, which can trigger long sweeps.

Erlang was designed for building applications that have (soft) real-time performance, and Erlang's garbage collection is optimized for this end. In Erlang, processes have separate heaps that are GC'd separately, which minimizes the time a process could freeze for garbage collection. Erlang also has ets, an in-memory storage facility for storing large amounts of data without any garbage collection (you can find more information on Erlang GC at http://prog21.dadgum.com/16.html).
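For a taste of ets (table and key names here are made up), a shell session looks roughly like this:

    1> Tab = ets:new(scores, [set, public]).  %% in-memory table owned by this process
    2> ets:insert(Tab, {alice, 42}).
    true
    3> ets:lookup(Tab, alice).                %% data is copied out on lookup
    [{alice,42}]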

Erlang might not have a decisive advantage here. The JVM has a new concurrent garbage collector designed to minimize freeze times. This article and this whitepaper (PDF warning) have some information about how it works. This collector trades performance and memory overhead for shorter freezes. I haven't found any benchmarks showing how well it works in production apps, though, or whether it is as effective as Erlang's garbage collector for low-latency apps.

Scheduling


The Erlang VM schedules processes preemptively. Each process gets a certain number of reductions (roughly equivalent to function calls) before it's swapped out for another process. Erlang processes can't call blocking operations that freeze the scheduler for long periods. All file IO and communications with native libraries are done in separate OS threads (communications are done using ports). Similar to Erlang's per-process heaps, this design ensures that Erlang's lightweight processes can't block each other. The downside is some communications overhead due to data copying, but it's a worthwhile tradeoff.

Scala has two types of actors: thread-based and event-based. Thread-based actors execute in heavyweight OS threads. They never block each other, but they don't scale to more than a few thousand actors per VM. Event-based actors are simple objects. They are very lightweight, and, like Erlang processes, you can spawn millions of them on a modern machine. The difference from Erlang processes is that within each OS thread, event-based actors execute sequentially without preemptive scheduling. This makes it possible for an event-based actor to block its OS thread for a long period of time (perhaps indefinitely).

According to the Scala actors paper, the Actors library also implements a unified model, by which event-based actors are executed in a thread pool, which the library automatically resizes if all threads are blocked due to long-running operations. This is pretty much the best you can do without runtime support, but it's not as robust as the Erlang implementation, which guarantees low latency and fair use of resources. In a degenerate case, all actors would call blocking operations, which would grow the native thread pool until it hits its limit of a few thousand threads.

This can't happen in Erlang. Erlang allocates only a fixed number of OS threads (typically one per processor core). Idle processes don't impose any overhead on the scheduler. In addition, spawning an Erlang process is always a cheap, fast operation. I don't think the same applies to Scala when all existing threads are blocked, because this condition first needs to be detected, and then new OS threads need to be spawned to execute pending actors. This can add significant latency (this is admittedly theoretical: only benchmarks can show the real impact).

Depending on what you're doing, the difference between process scheduling in Erlang and Scala may not impact performance much. However, I personally like knowing with certainty that the Erlang scheduler can gracefully handle pretty much anything I throw at it.

Distributed programming


One of Erlang's greatest strengths is that it unifies concurrent and distributed programming. Erlang lets you send a message to a process on the local VM or on a remote VM using exactly the same semantics (this is sometimes referred to as "location transparency"). Furthermore, Erlang's process spawning and linking/monitoring work seamlessly across nodes. This takes much of the pain out of building distributed, fault-tolerant applications.
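As a sketch, assuming a reachable node named 'worker@host' that shares your cookie (my_mod:init/0 is a placeholder for whatever the remote process runs), the remote case looks exactly like the local one:

    call_remote() ->
        Pid = spawn('worker@host', my_mod, init, []),  %% spawn on the remote node
        Ref = erlang:monitor(process, Pid),            %% monitors work across nodes too
        Pid ! {job, self()},                           %% same ! operator as a local send
        receive
            {result, R} -> R;
            {'DOWN', Ref, process, Pid, Reason} -> {error, Reason}
        end.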

The Scala Actors library has a RemoteActor type that apparently provides similar location transparency, but I haven't been able to find much information about it. According to this article, it's also possible to distribute Scala actors using Terracotta, which does distributed memory voodoo between nodes in a JVM cluster, but I'm not sure how well it works or how simple it is to set up. In Erlang, everything works out of the box, and it's so simple to get working that it's covered in the language's Getting Started manual.

Mnesia


Lightweight concurrency with no shared memory and pure message passing semantics is a fantastic toolset for building concurrent applications... until you realize you need shared (transactional) memory. Imagine building a WoW server, where characters can buy and sell items between each other. This would be very hard to build without a transactional DBMS of sorts. This is exactly what Mnesia provides -- with a number of extra benefits such as distributed storage, table fragmentation, no impedance mismatch, no GC overhead (due to ets), hot updates, live backups, and multiple disc/memory storage options (you can read the Mnesia docs for more info). I don't think Scala/Java has anything quite like Mnesia, so if you use Scala you have to find an alternative. You would probably have to use an external DBMS such as MySQL Cluster, which may incur a higher overhead than a native solution that runs in the same VM.
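To illustrate, here's roughly what an atomic transfer between two players could look like with Mnesia (the balance record and table are hypothetical, but the mnesia calls are the real API):

    -record(balance, {user, amount}).

    transfer(From, To, Amount) ->
        F = fun() ->
                [F1] = mnesia:read(balance, From, write),  %% take write locks
                [T1] = mnesia:read(balance, To, write),
                true = F1#balance.amount >= Amount,        %% abort if funds are short
                mnesia:write(F1#balance{amount = F1#balance.amount - Amount}),
                mnesia:write(T1#balance{amount = T1#balance.amount + Amount})
            end,
        mnesia:transaction(F).  %% returns {atomic, Result} or {aborted, Reason}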

Tail recursion


Functional programming and recursion go hand-in-hand. In fact, you could hardly write working Erlang programs without tail recursion because Erlang doesn't have loops -- it uses recursion for *everything* (which I believe is a good thing :) ). Tail recursion serves more than style -- it also facilitates hot code swapping. Erlang gen_servers call their loop() function recursively between calls to 'receive'. When a gen_server receives a code_change message, it can make a fully qualified call (e.g. Module:loop()) to re-enter its main loop with the new code. Without tail recursion, this style of programming would quickly result in stack overflows.

From my research, I learned that Scala has limited support for tail recursion due to bytecode restrictions in most JVMs. From http://www.scala-lang.org/docu/files/ScalaByExample.pdf:


In principle, tail calls can always re-use the stack frame of the calling function. However, some run-time environments (such as the Java VM) lack the primitives to make stack frame re-use for tail calls efficient. A production quality Scala implementation is therefore only required to re-use the stack frame of a directly tail-recursive function whose last action is a call to itself. Other tail calls might be optimized also, but one should not rely on this across implementations.


(If I understand the limitation correctly, tail call optimization in Scala only works within the same function: x() can make a tail call to itself, but if x() calls y(), y() can't make an optimized tail call back to x().)

In Erlang, tail recursion Just Works.
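For instance, mutually recursive functions run in constant stack space -- exactly the case the JVM restriction rules out:

    ping(0) -> done;
    ping(N) -> pong(N).      %% tail call to another function: no stack growth

    pong(N) -> ping(N - 1).  %% and back again, a million times if you like

Try ping(1000000) in the shell -- it just works.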

Network IO


Erlang processes are tightly integrated with the Erlang VM's event-driven network IO core. Processes can "own" sockets and send and receive messages to/from sockets. This provides the elegance of concurrency-oriented programming plus the scalability of event-driven IO (the Erlang VM uses epoll/kqueue under the covers). From Googling around, I haven't found similar capabilities in Scala actors, although they may exist.
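As a sketch (the port number and socket options are arbitrary), a process that owns a socket in active mode simply receives incoming data as ordinary messages:

    serve() ->
        {ok, Listen} = gen_tcp:listen(8080, [binary, {active, true}]),
        {ok, Socket} = gen_tcp:accept(Listen),  %% this process now owns the socket
        receive
            {tcp, Socket, Data}  -> io:format("got ~p~n", [Data]);
            {tcp_closed, Socket} -> io:format("peer closed~n")
        end.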

Remote shell


In Erlang, you can get a remote shell into any running VM. This allows you to analyze the state of the VM at runtime. For example, you can check how many processes are running, how much memory they consume, what data is stored in Mnesia, etc.

The remote shell is also a powerful tool for discovering bugs in your code. When the server is in a bad state, you don't always have to try to reproduce the bug offline to devise a fix. You can log right into the live system and see what's wrong. If the problem isn't obvious, you can make quick code changes to add more logging and then revert them once you've found it. From some Googling, I haven't found a similar feature for Scala/Java. It probably wouldn't be too hard to implement a remote shell for Scala, but without hot code swapping it would be much less useful.
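For reference, attaching a remote shell is a one-liner (the node names and cookie here are made up):

    $ erl -name debug@127.0.0.1 -setcookie secret -remsh server@127.0.0.1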

Simplicity


Scala runs on the JVM, it can easily call any Java library, and it is therefore closer than Erlang to many programmers' comfort zones. However, I think that Erlang is very easy to learn -- definitely easier than Scala, which contains a greater total number of concepts you need to know in order to use the language effectively (especially if you consider the Java foundations on which Scala is built). This is to a large degree due to Erlang's dynamic typing and lack of object orientation. I personally prefer Erlang's more minimalist style, but this is a subjective matter and I don't want to get into religious debates here :)

Libraries


Java indeed has a lot of libraries -- many more than Erlang. However, this doesn't mean that Erlang has no batteries included. In fact, Erlang's libraries are quite sufficient for many applications (you'll have to decide for yourself if they are sufficient for you). If you really need to use a Java library that doesn't have an Erlang equivalent, you could call it using Jinterface. It may or may not be a suitable option for your application. This can indeed be a deal breaker for some people who are deciding between the two languages.

There's an important difference between Java/Scala and Erlang libraries besides their relative abundance: virtually all "big" Erlang libraries use Erlang's concurrency and fault-tolerance features. In the Erlang ecosystem, you can get web servers, database connection pools, XMPP servers, and database servers, all of which use Erlang's lightweight concurrency, fault tolerance, etc. Most of Scala's libraries, on the other hand, are written in Java and don't use Scala actors. It will take Scala some time to catch up to Erlang in the availability of libraries built on actors.

Reliability and scalability


Erlang has been running massive systems for 20 years. Erlang-powered phone switches have run with nine nines availability -- only 31ms of downtime per year. Erlang also scales: from telecom apps to Facebook Chat, we have enough evidence that Erlang works as advertised. Scala, on the other hand, is a relatively new language, and as far as I know its actors implementation hasn't been tested in large-scale real-time systems.

Conclusion


I hope I did justice to Scala and Erlang in this comparison (which, by the way, took me way too long to write!). Regardless of these differences, though, I think that Scala has a good chance of being the more popular language of the two. Steve Yegge explains it better than I can:


Scala might have a chance. There's a guy giving a talk right down the hall about it, the inventor of – one of the inventors of Scala. And I think it's a great language and I wish him all the success in the world. Because it would be nice to have, you know, it would be nice to have that as an alternative to Java.

But when you're out in the industry, you can't. You get lynched for trying to use a language that the other engineers don't know. Trust me. I've tried it. I don't know how many of you guys here have actually been out in the industry, but I was talking about this with my intern. I was, and I think you [(point to audience member)] said this in the beginning: this is 80% politics and 20% technology, right? You know.

And [my intern] is, like, "well I understand the argument" and I'm like "No, no, no! You've never been in a company where there's an engineer with a Computer Science degree and ten years of experience, an architect, who's in your face screaming at you, with spittle flying on you, because you suggested using, you know... D. Or Haskell. Or Lisp, or Erlang, or take your pick."


Well, at least I'm not trying too hard to promote LFE... :)
Author: "Yariv (noreply@blogger.com)" Tags: "erlang, programming, scala"