博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Graham King » In-memory key-value store in C, Go and Python
阅读量:6069 次
发布时间:2019-06-20

本文共 5167 字,大约阅读时间需要 17 分钟。

excellent work,need cost one week to particese the code in the article

March 13, 2012

Posted in at 06:00 by graham

On paternity leave for my second child, I found myself writing an in-memory hashmap (a poor-man’s memcached), in Go, Python and C. I was wondering how hard it would be to replace memcached if we wanted to do something unusual with our key-value store. I also wanted to compare my knowledge of the languages, and, well, I get bored easily!

The code is on .

Each version implements enough of the get and set commands from the that we can test them with a memcached client.

The wonderful Internet

Thanks to a great discussion , in the comments here, and , I have heavily updated this post. You’re reading the new improved version.

Having the Internet review your code is humbling. It is also an amazing way to learn. The code, and my knowledge, have improved tremendously thanks to the feedback.

 

So far wonderful folks have contributed versions in , , , , , and . If you write a version in a different language (see Spec at bottom), please send it to me, either directly or as a pull-request on github. I’ll add it to the repo.

Networking

Respond to each request with a single write. Or switch TCP_NODELAY on.

I initially thought the Go version was many times faster than the C and Python versions, but that was due to quirks in my implementation.

The C and Python versions were doing multiple small writes to respond to each “get” request. On the server side was buffering those small writes, waiting for an ACK from the client of it’s previous send. On the client side, meant the client wasn’t sending the needed acknowledgment until a timeout. A textbook case, assuming you know about that textbook.

The solution was to combine those multiple writes into a single one. An alternative solution would be to switch on .

The Go version didn’t require me to know about TCP_NODELAY, which I appreciated.

Internally, Go sets the socket to be non-blocking and uses and to wait for data. Memcached does the same thing, via . I only used a single client, and the implementations are not thread-safe, so these don’t matter right now. They should matter with a large number of clients.

Writing it: Layers over Unix

Sockets: Our languages are wrapping POSIX system calls. C maps the calls exactly, and Python maps the C calls very closely. Go does a bit more work for us, replacing socket-bind-listen-accept with just listen-accept.

Multiple concurrent requests: Go has the magic statement, Python has the library, and C has . None of my implementations are thread safe (thanks ).

Read buffering: Managing the data you get from the socket was easy, once commenters taught me about C’s fdopen. Sockets are just streams of data, it’s up to the protocol to say when a message starts and ends. For the memcached protocol, we need to be able to read until \r\n, and also read a specific amount of bytes (which might include \r\n).

Go has the package. C lets you treat a socket as a file, thanks to . Python does the same thing with , which presumably wraps fdopen.

My C is rusty, so that version took me the longest to write, and is the least legible. The biggest slowdown was because I originally implemented buffering around the socket ‘recv’ call. Commenters (thanks kbob) pointed out I didn’t need to do that, and I replaced it.

The Python version took me a while to get right, and that’s mostly my fault. I knew Python wouldn’t make me write my own buffering code, and I spent too long playing with timeout and non-blocking setups. Once I realized I could treat the socket as a file (‘makefile’) things got easier.

Python 3′s universal newline support kindly normalizes \r\n to \n, but we’re disabling it here to preserve python 2 compatibility.

The Go version was the easiest to write because I had already used the bufio package, which solves our stream buffering problem.

Read on for how to build and profile the different versions yourself.


Build

  • Go version: You’ll need a recent weekly snapshot of Go (hg update weekly), or Go v1 if it is released by the time you read this. Run it with: go run memg.go, or compile it with go build memg.go and run the executable it makes.

  • Python version: Uses Python 2.7 or Python 3, just run the script.

  • C version: Build with gcc -std=gnu99 -g -W -Wall -pedantic -pthread -o memg memg.c. There should be no warnings.

Timing

This is explicitly not a benchmark, except if the question is “what can an average programmer do in a few hours?”.

All the versions perform about the same for me, and any differences are probably more due to my implementation than anything inherent in the languages. As said in the Hacker News comments:

In order to perform such a comparison you can’t write three small throw-away programs, you need to optimize each version at your best for weeks to start to be meaningful, and you need an expert in the three systems.

To time the different versions on your machine run: time python test.py. That script requires python 2 and pylibmc (pip install pylibmc or sudo apt-get install python-pylibmc).

Try test.py against memcached too.

Profiling

To make profiling easier, each version accepts a --single c

转载地址:http://bkfgx.baihongyu.com/

你可能感兴趣的文章
查看linux版本号的几种方法
查看>>
[转]MyBatis传入多个参数的问题 - mingyue1818
查看>>
Meteor 加入账户系统
查看>>
iOS可持续化集成: Jenkins + bundler + cocoapods + shenzhen + fastlane + pgyer
查看>>
计算几位学生的平均分
查看>>
Python黑客编程2 入门demo--zip暴力破解
查看>>
必看 :大数据挖掘中易犯的11大错误
查看>>
宿主系统为Ubuntu 14,CentOS 6.5 安装VirtualBox增强工具失败:Building the OpenGL support module[FAILED]...
查看>>
MVC学习系列14--Bundling And Minification【捆绑和压缩】--翻译国外大牛的文章
查看>>
Android实战简易教程-第十枪(画廊组件Gallery有用研究)
查看>>
POJ 2965:The Pilots Brothers' refrigerator
查看>>
Principle of Computing (Python)学习笔记(7) DFS Search + Tic Tac Toe use MiniMax Stratedy
查看>>
无法启动此程序,因为计算机中丢失 api-ms-win-crt-stdio-l1-1-0.dll 解决
查看>>
java获取指定文件夹下的所有文件名
查看>>
weex 项目开发(一) weex create project 与 weex init project 的区别
查看>>
PCH简单介绍
查看>>
c#实现用SQL池(多线程),定时批量执行SQL语句
查看>>
【译】Immutable.js: Map - 5
查看>>
【移动端 Web】怎么循序渐进地开发一个移动端页面
查看>>
Python把同一个对象循环赋值给另外一个变量
查看>>