User:Lukasblakk/PyCon2010
General Notes
- PyCon 2010 - Atlanta, GA - Feb 16-25th, 2010
- Shuttleworth keynote: Have a dedicated reviewer each day. Patches can't be more than 800 lines, but they will get reviewed and committed if you follow these rules. Promise upstreamers that if they maintain their stable well, you will commit to it in downstream stable. Keep users in the loop. The future of open source on the desktop is getting design professionals involved: do lots of "shut up and watch" testing, where people try stuff out in front of you and you see what you need to go back and fix. Don't help in the room, help with the code.
- Hudson as a continuous integration system. The sprint I did with the OpenGov folks was using Hudson, and so is SQLAlchemy; see http://fiftystates-dev.sunlightlabs.com/hudson/builds for an example. I found it really hard to see why someone would prefer Hudson over buildbot. I can't tell if it's just me being used to the look and feel of buildbot, but Hudson looks flimsy to me in some way. I'd still like to know more about it and see whether it's worth looking into, because people had really good things to say about it and a lot of folks were panning how tricky buildbot is to configure and work with... as we most certainly know. http://hudson-ci.org/
- Thoughts on Sprints - It turned out that I didn't do a Mozilla-specific sprint and instead worked with other teams on their projects. I feel like attending this conference was very useful in taking the temperature of lots of other open source communities, projects, workplaces, and workers. I have been soaking up information on how other projects do their work and have also had some chats with people about how they are building and releasing their software. For next year I would want to:
- Have our build scripts/buildbotcustom customizations in a python module that is able to be worked on outside of releng
- Do a lot more planning ahead of time for what kind of contributions we could use and how we would organize the sprint time
- Do more outreach at PyCon - lightning talk at least, if not a full 30-40 minute talk
- I think we should have a bigger presence at PyCon. From what I gather, this is a community of people dedicated to the language and all it can do - yet they are often stifled in their work and tend to work on Python projects on the side. We work with Python, we have a commitment to open source that is seemingly unparalleled, and we do BIG things with buildbot. People are a) really interested in what we do (see the note above about doing a talk next year: if we showed what we have pushed buildbot to do, I think folks would be really interested to talk about it with us and would probably have some good ideas) and b) love community and might see Mozilla's community as a nice side-step from the Python community. I heard many times how open and welcoming the Python community is. I think Mozilla is too, and I want to bring the two together.
Tutorials
Wednesday
Python 101
This was actually really useful to me because in my time at Mozilla I've worked on Python code that already exists and figured things out as I went, but I didn't have a clear picture of why Python works the way it does. I really enjoyed learning about list comprehensions and generators and look forward to getting a chance to write more Python using some of this new understanding.
Python 102
See Python 101, more of the same. Really enjoyed the tutorials.
Thursday
Intro to Twisted
This one was not so useful and I didn't really come away with any new insight into Twisted.
py.test
Interesting at first but went over my head quickly. It's obvious that we need to test our code more, and this might be a good way to do it but I would need to review the handouts and play with py.test a lot more to really get it.
Talks attended
Friday
(the -1/0/+1 rating before the title is my subjective rating of the talk)
- (+1) The Mighty Dictionary
- Learned how dicts work under the hood
- Now it makes sense that keys have to be hashable (so effectively immutable), and that you can't add or remove keys while iterating
- Great diagrams of how the size changes as needed to keep collisions low (under 40% max collisions)
- Average performance is amazing, only a few keys will necessitate lots of lookups
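A minimal sketch of the iteration point from the dictionary talk, assuming CPython behaviour: inserting keys while looping over a dict resizes the underlying table and raises `RuntimeError`, while looping over a snapshot of the keys is safe. The function names are made up for illustration.

```python
def add_while_iterating(d):
    """Try to insert new keys while looping over the dict itself."""
    try:
        for key in d:
            d[key + "_copy"] = d[key]  # changes the dict's size mid-iteration
    except RuntimeError as exc:
        return str(exc)  # CPython reports that the dict changed size
    return None


def add_via_snapshot(d):
    """Loop over a copy of the keys so the dict can grow safely."""
    for key in list(d):  # list(d) takes a snapshot of the current keys
        d[key + "_copy"] = d[key]
    return d


error = add_while_iterating({"a": 1, "b": 2})
result = add_via_snapshot({"a": 1, "b": 2})
```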
- (+1) Deployment, development, packaging, and a little bit of the cloud by Ian Bicking
- (0) Database scalability
- Feels very relevant to build infra scalability
- LOOKUP "consistent hashing" for db caching keys
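For the "consistent hashing" lookup item, here is a rough sketch of the general technique (not taken from the talk). Cache nodes are placed at many points on a hash ring, and each key goes to the first node point at or after its own hash. The class name `HashRing`, the node names, and the choice of MD5 with 100 points per node are all illustrative assumptions.

```python
import bisect
import hashlib


class HashRing(object):
    """Toy consistent-hash ring mapping keys to cache nodes."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node, smooths the spread
        self._keys = []           # sorted ring positions
        self._ring = {}           # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def _positions(self, node):
        return [self._hash("%s#%d" % (node, i)) for i in range(self.replicas)]

    def add(self, node):
        for pos in self._positions(node):
            self._ring[pos] = node
            bisect.insort(self._keys, pos)

    def remove(self, node):
        for pos in self._positions(node):
            del self._ring[pos]
            self._keys.remove(pos)

    def get(self, key):
        if not self._keys:
            raise KeyError("ring is empty")
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[self._keys[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = ["key-%d" % i for i in range(1000)]
before = dict((k, ring.get(k)) for k in keys)
ring.remove("cache-c")
after = dict((k, ring.get(k)) for k in keys)
moved = [k for k in keys if before[k] != after[k]]
```

The property that makes this useful for caching is that removing a node only remaps the keys that lived on that node; everything else stays where it was.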
- (+1) Deconstructing an Object
- Call the parent __init__ with super(Class, self).__init__ instead of having to use the superclass name. We should refactor our code for this: fewer hard-coded references to superclass names will make it easier to change things in the future
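A small sketch of the `super(Class, self).__init__` pattern from the notes above. The class names `BaseStep` and `ShellStep` are hypothetical and are not taken from our build code.

```python
class BaseStep(object):
    def __init__(self, name):
        self.name = name


class ShellStep(BaseStep):
    def __init__(self, name, command):
        # Instead of hard-coding the parent: BaseStep.__init__(self, name)
        super(ShellStep, self).__init__(name)
        self.command = command


step = ShellStep("compile", ["make", "-j4"])
```

If `ShellStep` is later re-parented onto a different base class, only the `class` line changes; the `__init__` body stays the same.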
Saturday
- (+1) Demystifying Non-Blocking and Asynchronous I/O
- Learned the difference between poll() and epoll()
- (+1) Unladen Swallow: Fewer Coconuts, Faster Python
- Google trains more Python engineers than anywhere else
- YouTube is 100% pure Python so it's in their interest to make it faster
- Why is Python slow? - everything is an object, lots of indirection, built-ins (lots of lookups)
- They had a goal to make Python 5x faster, but failed
- So they added a JIT compiler, which assumes the program is less dynamic than the language allows
- even though the language allows you to do all sorts of things, you are generally not doing this (changing a class' inheritance order during runtime)
- Stole gratuitously from other projects, nothing original (TraceMonkey, Self, PyPy)
- Most of Unladen Swallow is CPython, detects when your code is 'hot' and does optimization accordingly
- A lot of C code talk in a python talk...
- Top-down inlining - uses len() as an example of how to take the assumption of a constant type (dict) and drastically reduce the amount of code needed between the call to len() and the result returned
- 32 benchmarks and counting!
- Trying to get more external benchmarks for the suite so that it's not all using Google's insular, homogeneous code bases
- Speed improvement is more like 1.01-1.84x faster (vs. CPython)
- 36x faster than PyPy on unpickle
- Memory usage needs to go down (the JIT compiler takes a lot) - it's one of the qualifications in the merger proposal
- Looking Back:
- Did first release in Q1 of 2009, picked low hanging fruit in CPython implementation
- Why didn't they hit 5x? They went into it using LLVM thinking it would be a pretty solid platform (based on Apple's use of it for graphics) - that turned out to be an incorrect assumption - 16mb is not enough
- Right now they just need more hands on the code base
- 5x is easily achievable with the infrastructure that exists now
- GIL - Global Interpreter Lock - they're working on it. Thought they had a solution for dealing with it but it had already been tried/failed.
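The len() inlining idea in the notes above can be pictured in plain Python. This is only a conceptual sketch of guard-based specialization, not Unladen Swallow's actual generated code; both function names are invented for the illustration.

```python
import builtins


def generic_len(obj):
    """Fully dynamic path: find the builtin by name, then dispatch on type."""
    len_func = getattr(builtins, "len")  # name lookup on every call
    return len_func(obj)                 # dispatches to type(obj).__len__


def specialized_len(obj):
    """What a specializing compiler might emit after only ever seeing dicts."""
    if type(obj) is dict:                # guard: cheap exact-type check
        return dict.__len__(obj)         # direct call, no lookups in between
    return generic_len(obj)              # guard failed: fall back to slow path


fast = specialized_len({"a": 1, "b": 2})
slow = specialized_len([1, 2, 3])
```

The bet is that the guard almost always passes in hot code, so the fallback is rarely taken.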
- (0) Diversity as a Dependency
- Pythonista, co-editor of Python cookbook, senior undergrad at Stanford
- Diversity reasons usually framed as {Moral, Legal, Political, Social Cohesion}
- Usually it's just driven by guilt, great motivator, stops you from engaging in anti-social behaviour. But usually people are either laying on guilt, threatening guilt, or avoiding guilt. This just pisses people off.
- This talk is a guilt-free zone
- Diversity is not a club to beat people with
- WIIFM - why should people care?
- WIIFP (What's in it for Python?)
- A study from the 1950s found that people in a small town of 500 who had subscriptions to newspapers/magazines that brought them news from around the world had the most influence on how the town worked
- Another study found that people with relationships across 'structural holes' were the most creative and productive at finding solutions.
- Science labs:
- working alone, a scientist explains away unexpected results - working in teams: get together and look at why.
- Found analogies, conceptual changes
- The labs that were really solving the difficult problems had different pools of knowledge to draw upon.
- Universal Design: eg. curb cuts - designed for wheelchairs but benefit everyone
- text-to-speech
- voice control
- Focusing on the kinds of diversity that matter for Python is less about identity itself - look more at needs/wants/motivations
- We all have different skill sets & perspectives to bring to Python (Open Source)
- This is where identity (gender, race, ability) really comes into play
- Our perspectives are affected by this
- However, diversity is hard - we have to be willing to do the work
- (+1) Small acts make great revolutions: crafting Python and Open Source communities in Rio de Janeiro
- Henrique has gotten a great community going in Rio, and he's also experienced and interested in build & release. Will be sending us his resume
- (0) How Python is guiding infrastructure construction in Africa
- Interesting talk. The speaker has taught grad students Python so they can do image scanning for houses in remote areas in various countries in Africa and then do geo-spatial crunching to figure out the best path for electricity to be installed.
- (+1) Think Globally, Hack Locally - Teaching Python in your Community
- Lots of great info and shared experience about teaching/mentoring newer programmers and community members
- (+1) Persistent Graphs in Python with Neo4j
- Neo4j is really interesting and I wonder if it would help graphserver at all since it uses a different model than relational database. Will bring in more info from the presenter's slides.
Sunday
- (0) Modern Version Control: Mercurial internals
- showed a revlog index - rev no. will be different for each local working copy
- DAG - two parents, p1 and p2 in the revlog
- Three tiers of revlogs for each clone: Changelog (hg log), Manifest (hg up), Filelogs (hg log <file>) - per-file revision
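A toy model of the revlog points above: each entry records two parents (p1, p2), and the position in the index is the local rev number, which depends on the order changes arrived in that clone. The node hash here, SHA-1 over the sorted parents plus the text, is a simplification I am assuming matches Mercurial's scheme; `ToyRevlog` and its methods are invented for illustration.

```python
import hashlib

NULL_ID = b"\0" * 20  # stand-in for "no parent"


def node_id(text, p1=NULL_ID, p2=NULL_ID):
    """Content-based ID: hash of the sorted parents followed by the text."""
    lo, hi = sorted([p1, p2])
    return hashlib.sha1(lo + hi + text).digest()


class ToyRevlog(object):
    def __init__(self):
        self.index = []  # list of (node, p1, p2); list position = local rev

    def add(self, text, p1=NULL_ID, p2=NULL_ID):
        node = node_id(text, p1, p2)
        self.index.append((node, p1, p2))
        return node

    def rev(self, node):
        for rev, (entry_node, _p1, _p2) in enumerate(self.index):
            if entry_node == node:
                return rev
        raise KeyError(node)


# Two clones receive the same changes in a different order.
repo_a, repo_b = ToyRevlog(), ToyRevlog()

root_a = repo_a.add(b"root")
x_a = repo_a.add(b"change x", p1=root_a)
y_a = repo_a.add(b"change y", p1=root_a)
merge_a = repo_a.add(b"merge", p1=x_a, p2=y_a)

root_b = repo_b.add(b"root")
y_b = repo_b.add(b"change y", p1=root_b)
x_b = repo_b.add(b"change x", p1=root_b)
merge_b = repo_b.add(b"merge", p1=y_b, p2=x_b)
```

In this model "change x" has the same node ID in both clones but a different local rev number, which is why rev numbers can't be shared between working copies while hashes can.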
- (+1) Hg and Git - can't we all just get along?
- why do we have two systems that are so similar?
- Linus used git right off the bat. Two weeks later along came hg, but Linus didn't like revlog - he wanted a file system type storage mechanism
- Both survived because both Matt and Linus are hardheaded
- both are fully local (freaking fast, clones are backups, work offline)
- both implement histories as directed graphs, makes merging much easier
- if you know one, you largely know the other
- Enemy is not each other - it's SVN :)
Development Sprints
OpenGov
- Worked on http://us.pycon.org/2010/sprints/projects/opengov/
- First day involved getting a bunch of bugs sorted out with hg-git (http://hg-git.github.com/). There's a problem turning git repos into hg clones because it flattens all the branches.
- Working on adding a scraper for Oregon: http://fiftystates-dev.sunlightlabs.com/contributing.html