
Planet Python

Last update: February 02, 2016 01:49 AM

February 01, 2016


PyTennessee

PyTN Profiles: Edward Finkler (@funkatron)

Speaker Profile: Edward Finkler (@funkatron)

Ed Finkler, also known as Funkatron, started making web sites before browsers had frames. He does front-end and server-side work in Python, PHP, and JavaScript. He is the Lead Developer and Head of Developer Culture at Graph Story.

He served as web lead and security researcher at The Center for Education and Research in Information Assurance and Security (CERIAS) at Purdue University for 9 years. Along with Chris Hartjes, Ed is co-host of the Development Hell podcast.

Ed’s current passion is raising mental health awareness in the tech community with his Open Sourcing Mental Illness speaking campaign.

Ed writes at funkatron.com.

Ed will be presenting “How To Be A Great Developer” on Sunday at 1PM. Being a great developer is much more than technical know-how. Empathy, communication, and reason are at least as important, but are undervalued in our industry. We’ll examine the impact these skills can have and how to apply them to our work.

February 01, 2016 06:25 PM


Django Weblog

Django releases issued: 1.9.2 (security) and 1.8.9 (bugfix)

In accordance with our security release policy, the Django team is issuing Django 1.9.2. This release addresses a security issue detailed below. We encourage all users of Django to upgrade as soon as possible. The Django master branch is also updated.

Today we've also issued a bugfix release for the 1.8 release series. Details can be found in the release notes for 1.8.9.

CVE-2016-2048: User with "change" but not "add" permission can create objects for ModelAdmins with save_as=True

If a ModelAdmin uses save_as=True (not the default), the admin provides an option when editing objects to "Save as new". A regression in Django 1.9 prevented that form submission from raising a "Permission Denied" error for users without the "add" permission.
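
For illustration, the affected configuration looks something like this (a minimal sketch; the app and model names are hypothetical, but save_as is the standard ModelAdmin attribute):

from django.contrib import admin

from myapp.models import Article  # hypothetical app and model

class ArticleAdmin(admin.ModelAdmin):
    # save_as=True (not the default) adds the "Save as new" button
    # whose "add" permission check regressed in Django 1.9.
    save_as = True

admin.site.register(Article, ArticleAdmin)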

Thanks Myk Willis for reporting the issue.

Affected supported versions

  • Django master development branch
  • Django 1.9

Django 1.8 is not affected. Per our supported versions policy, Django 1.7 and older are no longer receiving security updates but are also unaffected.

Resolution

Patches have been applied to Django's master development branch and to the 1.9 release branch, which resolve the issue described above. The patches may be obtained directly from the following changesets:

The following new release has been issued:

The PGP key ID used for these releases is Tim Graham: 1E8ABDC773EDE252.

General notes regarding security reporting

As always, we ask that potential security issues be reported via private email to security@djangoproject.com, and not via Django's Trac instance or the django-developers list. Please see our security policies for further information.

February 01, 2016 05:25 PM


Continuum Analytics News

Anaconda for R users: SparkR and rBokeh

Posted Monday, February 1, 2016

In this post, we present two projects for the R programming language that are powered by Anaconda. We will explore how rBokeh allows you to create beautiful interactive visualizations and how easy it is to scale your predictive models with SparkR through Anaconda’s cluster management capabilities.

Bokeh and rBokeh

Bokeh is an interactive visualization framework that targets modern web browsers for presentation. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards and data applications, without having to learn web technologies such as JavaScript. Bokeh currently provides interfaces in Python, R, Lua and Scala. rBokeh is the R library that allows you to write interactive visualizations in R.

Spark and SparkR

Spark is a popular open source processing framework for large scale in-memory distributed computations. SparkR is an R package that provides a frontend API to use Apache Spark from R.

Getting started with R in Anaconda

Conda, R Essentials and the R channel on Anaconda Cloud.

The easiest way to get started with R in Anaconda is installing R Essentials, a bundle of over 80 of the most used R packages for data science, including dplyr, shiny, ggplot2, tidyr, caret and nnet. R Essentials also includes Jupyter Notebooks and the IRKernel. To learn about conda and Jupyter, visit our previous blog post, "Jupyter and conda for R."

Conda is Anaconda's package, dependency and environment manager. Conda works across platforms (Windows, Linux, OS X) and across languages (R, Python, Scala...). R users can also use conda and benefit from its capabilities: create and install R packages, and manage portable, sandboxed environments that may have different packages or versions of packages. An R channel is available on Anaconda Cloud with over 200 R packages.

To install R Essentials run:

$ conda install -c r r-essentials

To learn more about the benefits of conda, how to create R conda packages, and how to manage projects with Python and R dependencies, visit our previous blog post, “Conda for data science.”

rBokeh: Interactive Data Visualizations in R

rBokeh is included in R Essentials, but it can also be separately installed from the R channel:

$ conda install -c r r-rbokeh

Once you have rBokeh installed, you can start an R console by typing `r` in your terminal:

$(r-essentials):~/$ r

Import the rBokeh library and start creating your interactive visualizations. The following example draws a scatterplot of the iris dataset with different glyphs and marker colors depending on the Species class, and the hover tool indicates their values as you mouse over the data points:

> library(rbokeh)
> p <- figure() %>%
  ly_points(Sepal.Length, Sepal.Width, data = iris,
    color = Species, glyph = Species,
    hover = list(Sepal.Length, Sepal.Width))
> p

rBokeh plots include a toolbox with the following functionality: panning, box zooming, resizing, wheel zooming, resetting, saving and tooltip hovering.

rBokeh and Shiny

Besides the interactivity that is offered through the toolbox, rBokeh also integrates nicely with Shiny, allowing you to create visualizations that can be animated.

Here’s an example of a simple Shiny App using rBokeh that generates a new hexbin plot from randomly sampling two normal distributions (x and y).

library("shiny")
library("rbokeh")
library("htmlwidgets")

ui <- fluidPage(
  rbokehOutput("rbokeh")
)

server <- function(input, output, session) {
  output$rbokeh <- renderRbokeh({
    invalidateLater(1500, session)
    figure(plot_width = 400, plot_height = 800) %>%
      ly_hexbin(rnorm(10000), rnorm(10000))
  })
}

shinyApp(ui, server)

For more information and examples on rBokeh, visit the rBokeh documentation.

Using Anaconda for cluster management

The simplicity that conda brings to package and environment management can be extended to your cluster through Anaconda’s capabilities for cluster management. Anaconda for cluster management is freely available for unlicensed, unsupported use with up to 4 cloud-based or bare-metal cluster nodes.

You can install the cluster management library from the anaconda-cluster channel on Anaconda Cloud. You must have an Anaconda Cloud account and be logged in via anaconda login.

conda install anaconda-client
anaconda login
conda install anaconda-cluster -c anaconda-cluster

For detailed installation instructions, refer to the Installation section in the documentation.

Setting up your cloud cluster

In this example, we will create and provision a 4-node cloud-based cluster on Amazon EC2. After installing the Anaconda cluster management library, run the following command:

$ acluster

This will create the ~/.acluster directory, which contains all of the configuration information. Edit the ~/.acluster/providers.yaml file and add your AWS credentials and key file information.

aws_east:
  cloud_provider: ec2
  keyname: my-private-key
  location: us-east-1
  private_key: ~/.ssh/my-private-key.pem
  secret_id: AKIAXXXXXX
  secret_key: XXXXXXXXXX

Next, create a profile in the ~/.acluster/profiles.d/ directory that defines the cluster and includes the spark-standalone and notebook plugins.

name: aws_sparkr
node_id: ami-d05e75b8
node_type: m3.large
num_nodes: 4
plugins:
  - spark-standalone
  - notebook
provider: aws_east
user: ubuntu

You can now launch and provision your cluster with a single command:


$ acluster create spark_cluster --profile aws_sparkr

Notebooks and R Essentials on your cluster

After the cluster is created and provisioned with conda, Spark, and a Jupyter Notebook server, we can install R Essentials on all of the cluster nodes with:

$ acluster conda install r-essentials -c r

You can open the Jupyter Notebook server that is running on the head node of your cluster with:

$ acluster open notebook

The default notebook password is acluster, which can be customized in your profile. You can now open an R notebook that is running on the head node of your cloud-based cluster.

For example, here’s a simple notebook that runs on a single node with R Essentials.

The full example notebook can be viewed and downloaded from Anaconda Cloud.

Running SparkR on your cluster


You can open the Spark UI in your browser with the following command:

$ acluster open spark-standalone

Now, let’s create a notebook that uses SparkR to distribute a predictive model across your 4-node cluster. Start a new R notebook and execute the following lines:

Sys.setenv(SPARK_HOME='/opt/anaconda/share/spark')
.libPaths(c(file.path(Sys.getenv('SPARK_HOME'), 'R', 'lib'), .libPaths()))

library(SparkR)
sc <- sparkR.init("spark://{{ URL }}:7077", sparkEnvir = list(spark.r.command='/opt/anaconda/bin/Rscript'))

Replace {{ URL }} in the above command with the URL displayed in the Spark UI. In my case, the URL is spark://ip-172-31-56-255:7077.


The following notebook uses three Spark workers across the cluster to fit a predictive model.

You can view the running Spark applications in the Spark UI, and you can verify that the three Spark workers are running the application.

You can view and download the SparkR notebook from Anaconda Cloud.

Once you’ve finished, you can easily destroy the cluster with:

$ acluster destroy spark_cluster

Conclusion

We have presented two useful projects for doing data science in R with Anaconda: rBokeh and SparkR. Learn more about these projects in their respective documentation pages: rBokeh and SparkR. I also recommend downloading the Anaconda for cluster management cheat sheet to help you set up, manage, and provision your clusters.

Thanks to Ryan Hafen, rBokeh developer; and all of the Spark and SparkR developers.

For more information about scaling up Python and R in your enterprise, or Anaconda's cluster management features, contact sales@continuum.io.

February 01, 2016 05:18 PM


Peter Bengtsson

Bestest and securest way to handle Python dependencies

pip 8 is out and with it, the ability to only install dependencies you've vetted. Thank Erik Rose! Now you can be absolutely certain that the dependencies you download and install locally are identical to the dependencies you download and install on your production server.

First pipstrap.py

So your server needs pip to install those dependencies safely and securely. Initially you have to trust the pip/virtualenv that is installed globally on the system. If you can trust it but are unsure it's a good version of pip (version 8 and up), that's where pipstrap.py comes in. It makes sure you get a pip version installed that supports pip install with hashes:

Add pipstrap.py to your git/hg repo and use it to make sure you have a good pip. For example your deployment script might look like this now:

#!/bin/bash
git pull origin master
virtualenv venv
source venv/bin/activate
python ./tools/pipstrap.py
pip install --require-hashes -r requirements.txt

Then hashin

Thanks to pipstrap we now have a version of pip that really does check the hashes you've put in the requirements.txt file.

(By the way, the --require-hashes flag on pip install is optional. pip will imply it if the requirements.txt file appears to have hashes defined. But to avoid the risk of accidentally fumbling a bad requirements.txt, it's good to specify --require-hashes to pip install.)

Now that you're up and running and you sleep well at night because you know your production server has exactly the same dependencies you had when you did the development and unit testing, how do you get the hashes in there?

The trick is to install hashin (pip install hashin). It helps you write those hashes.

Suppose you have a requirements.txt file that looks like this:

Django==1.9.1
bgg==0.22.1
html2text==2016.1.8

You can try to run pip install --require-hashes -r requirements.txt and learn from the errors. E.g.:

Hashes are required in --require-hashes mode, but they are missing from some requirements. 
Here is a list of those requirements along with the hashes their downloaded archives actually 
had. Add lines like these to your requirements files to prevent tampering. (If you did not 
enable --require-hashes manually, note that it turns on automatically when any package has a hash.)
    Django==1.9.1 --hash=sha256:9f7ca04c6dbcf08b794f2ea5283c60156a37ebf2b8316d1027f594f34ff61101
    bgg==0.22.1 --hash=sha256:e5172c3fda0e8a42d1797fd1ff75245c3953d7c8574089a41a219204dbaad83d
    html2text==2016.1.8 --hash=sha256:088046f9b126761ff7e3380064d4792279766abaa5722d0dd765d011cf0bb079

But those are just the hashes for your particular environment (and your particular support for Python wheels). Instead, take each requirement and run it through hashin:

$ hashin Django==1.9.1
$ hashin bgg==0.22.1
$ hashin html2text==2016.1.8

Now your requirements.txt will look like this:

Django==1.9.1 \
    --hash=sha256:9f7ca04c6dbcf08b794f2ea5283c60156a37ebf2b8316d1027f594f34ff61101 \
    --hash=sha256:a29aac46a686cade6da87ce7e7287d5d53cddabc41d777c6230a583c36244a18
bgg==0.22.1 \
    --hash=sha256:e5172c3fda0e8a42d1797fd1ff75245c3953d7c8574089a41a219204dbaad83d \
    --hash=sha256:aaa53aea1cecb8a6e1288d6bfe52a51408a264a97d5c865c38b34ae16c9bff88
html2text==2016.1.8 \
    --hash=sha256:088046f9b126761ff7e3380064d4792279766abaa5722d0dd765d011cf0bb079

One Last Note

pip is smart enough to traverse the nested dependencies of packages that need to be installed. For example, suppose you do:

$ hashin premailer

It will only add...

premailer==2.9.7 \
    --hash=sha256:1516cbb972234446660bf7862b28521f0fc8b5e7f3087655f35ae5dd233013a3 \
    --hash=sha256:843e624bdac9d28725b217559904aa5a217c1a94707bc2ecef6c91a8d82f1a23

...to your requirements.txt. But this package has a bunch of dependencies of its own. To find out what those are, let pip "fail for you".

$ pip install --require-hashes -r requirements.txt
Collecting premailer==2.9.7 (from -r r.txt (line 1))
  Downloading premailer-2.9.7-py2.py3-none-any.whl
Collecting lxml (from premailer==2.9.7->-r r.txt (line 1))
Collecting cssutils (from premailer==2.9.7->-r r.txt (line 1))
Collecting cssselect (from premailer==2.9.7->-r r.txt (line 1))
In --require-hashes mode, all requirements must have their versions pinned with ==. These do not:
    lxml from https://pypi.python.org/packages/source/l/lxml/lxml-3.5.0.tar.gz#md5=9f0c5f1eb43ff44d5455dab4b4efbe73 (from premailer==2.9.7->-r r.txt (line 1))
    cssutils from https://pypi.python.org/packages/2.7/c/cssutils/cssutils-1.0.1-py2-none-any.whl#md5=b173f51f1b87bcdc5e5e20fd39530cdc (from premailer==2.9.7->-r r.txt (line 1))
    cssselect from https://pypi.python.org/packages/source/c/cssselect/cssselect-0.9.1.tar.gz#md5=c74f45966277dc7a0f768b9b0f3522ac (from premailer==2.9.7->-r r.txt (line 1))

So apparently you need to hashin those three dependencies:

$ hashin lxml
$ hashin cssutils
$ hashin cssselect

Now your requirements.txt file will look something like this:

premailer==2.9.7 \
    --hash=sha256:1516cbb972234446660bf7862b28521f0fc8b5e7f3087655f35ae5dd233013a3 \
    --hash=sha256:843e624bdac9d28725b217559904aa5a217c1a94707bc2ecef6c91a8d82f1a23
lxml==3.5.0 \
    --hash=sha256:349f93e3a4b09cc59418854ab8013d027d246757c51744bf20069bc89016f578 \
    --hash=sha256:8628cc82957c41be10abce889a1976ceb7b9e3f36ebffa4fcb1a80901bf77adc \
    --hash=sha256:1c9c26bb6c31c3d5b3c104e843211d9c105db60b4df6770ac42673263d55d494 \
    --hash=sha256:01e54511034333f18772c335ec0b33a76bba988135eaf727a075897866d19604 \
    --hash=sha256:2abf6cac9b7952047d8b7265384a9565e419a727dba675e83e4b7f5b7892b6bb \
    --hash=sha256:6dff909020d0c030fb26004626c8f87f9116e0381702fed415caf94f5a9b9493
cssutils==1.0.1 \
    --hash=sha256:78ac48006ac2336b9456e88a75ed35f6a31a030c65162503b7af01a60d78db5a \
    --hash=sha256:d8a18b2848ea1011750231f1dd64fe9053dbec1be0b37563c582561e7a529063
cssselect==0.9.1 \
    --hash=sha256:0535a7e27014874b27ae3a4d33e8749e345bdfa62766195208b7996bf1100682

Ah... Now you feel confident.

Actually, One More Last Note

Sorry for repeating the obvious but it's so important it's worth making it loud and clear:

Use the same pip install procedure and requirements.txt file everywhere

I.e. install the dependencies the same way on your laptop, your continuous integration server, your staging server and your production server. That really makes sure you're running the same process and the same dependencies everywhere.

February 01, 2016 04:34 PM


PyTennessee

PyTN Profiles: Lorena Mesa (@loooorenanicole) and Tennessee Data Commons (@tndatacommons)

Speaker Profile: Lorena Mesa (@loooorenanicole)

Political analyst turned coder, Lorena Mesa works as a platform software engineer at Sprout Social and is a co-organizer for PyLadies Chicago. Lorena loves to make meaning out of data, asking big questions and using her code to build models to derive meaning from it. Part Star Wars fanatic but mostly a Trekkie, Lorena abides by the motto to “live long and prosper”.

Lorena will be presenting “Using Python to hack my commute home” on Sunday at 1PM. Google Maps. Apple Maps. Endless map apps. Why don’t any of these speak to me as a Chicagoan commuting home? Easy - they don’t know how I like to travel or where I want to go. Solution? Let’s use Python to hack on data to find my own way home. We’ll explore APIs to find data, collect our own, and analyze it with PyLab. Time to put on our thinking caps and hack our way to a better commute home.

Sponsor Profile: Tennessee Data Commons (@tndatacommons)

Tennessee Data Commons provides a digital mentorship platform to support individual goals in ways that strengthen the community at large. We’re using social psychology, behavioral economics, and user-centered design principles to help people set goals and discover behaviors that help them succeed.

February 01, 2016 02:54 PM


Doug Hellmann

collections — Container Data Types — PyMOTW 3

The collections module includes container data types beyond the built-in types list, dict, and tuple. Read more… This post is part of the Python Module of the Week series for Python 3. See PyMOTW.com for more articles from the series.
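
As a quick taste of what the module offers (an illustrative snippet of standard library API, not an excerpt from the article):

import collections

# Counter tallies hashable items.
counts = collections.Counter('abracadabra')
print(counts.most_common(1))    # [('a', 5)]

# namedtuple builds lightweight record classes.
Point = collections.namedtuple('Point', ['x', 'y'])
print(Point(x=1, y=2).x)        # 1

# defaultdict supplies missing values automatically.
grouped = collections.defaultdict(list)
grouped['evens'].append(2)
print(dict(grouped))            # {'evens': [2]}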

February 01, 2016 02:00 PM


Mike Driscoll

PyDev of the Week: Oliver Schoenborn

This week we welcome Oliver Schoenborn as our PyDev of the Week. He is the author of the PyPubSub project, a version of which is included with wxPython. He has been an active contributor on the wxPython mailing list where I have always appreciated his insights. You might find his Dr. Dobbs article interesting as well, even though it’s a bit old. Let’s spend a few moments getting to know him better!

Can you tell us a little about yourself (hobbies, education, etc):

I’m a Senior Consultant at CAE Inc in Ottawa, where I engineer simulation systems for a variety of applications.

I started programming on an Apple IIe in 1982 when I was 13 years old. I bought it used, with my own money that I had saved for a few years for “some day when I would want something really big”. I discovered Assembly programming on that machine, with peeks and pokes and interrupts and registers, and was hooked. I moved on to Basic and Pascal and Prolog. I created my first simulation in my last year of high-school for a programming course project. Discovering C++ in the mid-90’s was a revelation, I found the object-oriented approach so intuitive, and I’m still a stickler for clean interfaces and refactoring. During my high-school years I thought that Physics was my passion and I received my Physics PhD in 1998 from University of Toronto, but I came to realize that programming was my real passion and have made that the focus of my professional career.

I haven’t worked in Physics since my degree, but during my PhD I developed many valuable skills such as problem solving, bug finding, testing, approximations, process modeling, and Unix development. As such, I have been fortunate to work on some very fun and challenging industrial projects, including: a crane operation trainer in a fully immersive virtual environment (with a real crane cab and controls, surround display, etc.); a Search and Rescue trainer which allows an instructor to challenge a student to spot and alert against threats on a military aircraft; an avionics maintenance trainer that allows a classroom of students to each troubleshoot defects on a modern aircraft using virtual tools and a virtual cockpit and work areas; and a Human Resources planner that allows an Operational Research department to conduct “what-if” analyses of 100,000 employees evolving over the span of 20-30 years in the future.

Other than an obsession for programming, I love snowboarding, and playing the harmonica (blues and folk, although I don’t have much time anymore to learn anything new). If you are middle-aged and want to learn how to snowboard without breaking your rear-end, let me know and I’ll share the tricks that allowed me to enjoy this wonderful sport.

Why did you start using Python?

Oddly, I started using Python because in 2001 a project I was assigned to required Windows GUI programming, which I had never done. Since learning MFC did not interest me, I looked at a few options in C++, and came across the wxPython library, a Python API to the C++ wxWidgets library. I immediately fell in love with Python, and discovered that I could use SWIG to integrate Fortran and C++ backends into a Python + wxPython application, thus I could get my cake and eat it too.

What other programming languages do you know and which is your favorite?

I have done Basic, Prolog, Fortran 77, Pascal, Delphi, C, C++, Lua, Java, C#, Javascript. I have looked at Scala, Haskell, Go, Clojure, Ruby, but none of them were compelling once I knew Python and C++.

My favorite all-purpose language is Python: the syntax is so intuitive and clean, I find I have to do very little translation to go from design and algorithm in my head into code on the screen. For tasks where speed is not the main factor, it is the perfect language.

When speed is essential, my favorite language is C++. Although, I would love to find a language that has a simple and clean syntax like Python but is as close to the metal and supports OO and generics like C++. Ideally, the language would also support type inference and function annotation, and produce equally fast code.

When application scripting is required, my favorite is Lua: designed from A to Z to be embedded in a host application via C/C++, it is simple yet so versatile (but please don’t try to use it to write a whole application).

My favourite tool to integrate all these languages is SWIG: it takes care of generating all the boiler-plate code necessary to integrate different programming languages.

What projects are you working on now?

In the little spare time that I have as a father of two active kids, I try to keep up with technology and also update PyPubSub. I would love to get involved in a cloud-based project, maybe an educational game or a mobile app, or perhaps a hardware-based app like a security system. But this would almost certainly have to be through my work due to time constraints.

At work, I am currently leading the development of a human-resources simulation GUI tool in PyQt 5.3 and Python 3.4. The tool supports Python scripts written by the user, integrates the Python debugger to enable the user to debug their scripts as they are being run in the GUI and supports multi-processor-based simulations of the user’s HR model on 24 core machines. In future, it will also support an HPC cluster of machines.


Which Python libraries are your favorite (core or 3rd party)?

For desktop GUI, I’m in love with PyQt. For web apps, I really like Flask and AngularJS and emberjs. One I have not used in a while is rpyc, a powerful remote-python-call library that can be used for distributed Python computing. I used it 2007-2009 for a distributed simulation management tool that I developed for a client.

Some core modules I find really increase productivity are subprocess, re, urllib2, timing, datetime, threading, functools, and textwrap, although I always look in the core libs first. A really powerful but unfortunately not very well documented module is bdb, which allows an end-user to step through Python scripts, manage breakpoints, introspect local variables, etc. For applications that support scripting, this is amazing.
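
For a flavor of what bdb makes possible, here is a minimal sketch (my own illustration, not Oliver's code) of a tracer that reports every line the traced code executes:

import bdb

class LineTracer(bdb.Bdb):
    """Report every line executed by the traced code."""

    def user_line(self, frame):
        # Called each time the debugger stops on a new line.
        print('%s:%d' % (frame.f_code.co_filename, frame.f_lineno))
        self.set_step()  # keep single-stepping

LineTracer().run('x = 1\ny = x + 1\nprint(y)')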

Where do you see Python going as a programming language?

Although I don’t spend much time thinking about this, it is fun to ponder for a moment. I would guess that the need for speed, beyond the speed-ups provided by faster hardware, will pressure the language designers to find ways to make it faster, although this will almost certainly come at the cost of decreased dynamism and clarity. Multithreading will continue to be limited by the GIL, so true parallel execution will continue to evolve via the multiprocessing package and third-party cluster/cloud abstraction packages. Maintainability and robustness will continue to be important as building large Python applications continues to gain in popularity, such that “intent” based programming, likely based upon code annotations, will continue to develop. Backwards compatibility will tend to make such extensions more verbose than necessary, such that another Python 3k transition (this time to Python 4k) may be necessary in less than 10 years.


What is your take on the current market for Python programmers?

Python is often mentioned in software engineering job ads but my impression is that it is often used merely as a scripting language. The language itself, the set of libraries available for it, the library management, documentation and testing tools available for it, and its ability to integrate with C++, make it a very strong contender for full-fledged applications. But at least in Canada, it can be a challenge to find Python application programmers. Or perhaps they are all very busy working!!!

Is there anything else you’d like to say?

StackOverflow, Google search, and Open Source, are godsends. Python annotations are great, especially the way they were made optional. PyCharm is the most awesome Python IDE I have ever used (and I have used many).

Thanks for doing the interview!

 

February 01, 2016 01:30 PM


Péter Zsoldos

warnings.warn - some DeprecationWarning gotchas

DeprecationWarnings

It's a good practice to gradually deprecate one's library's API, so that users get advance warning of coming changes. The built-in way to do so is Python's warnings module:

import warnings

if __name__ == '__main__':
    warnings.warn('deprecation', DeprecationWarning)

By default, they are not reported!

However, if I run this code, there is no output.

$ python3.5 demo.py
$ echo $?
0

The documentation on default warning filters explains:

By default, Python installs several warning filters, which can be overridden by the command-line options passed to -W and calls to filterwarnings().

  • DeprecationWarning, PendingDeprecationWarning, and ImportWarning are ignored.
  • BytesWarning is ignored unless the -b option is given once or twice; in this case this warning is either printed (-b) or turned into an exception (-bb).
  • ResourceWarning is ignored unless Python was built in debug mode.
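
If you want these warnings reported without touching the command line, you can also loosen the filter programmatically (a small sketch using the standard warnings API; this is my addition, not part of the quoted documentation):

import warnings

# "default" prints each unique warning once per location.
warnings.simplefilter('default', DeprecationWarning)

warnings.warn('deprecation', DeprecationWarning)  # now reported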

Forcing defaults from the command line makes them reported

However, if we force the default warning behavior from the command line, we get the warnings - even though theoretically we only specified how often the warnings should be reported, not which warnings should be reported!

$ python3.5 -W d demo.py
demo.py:4: DeprecationWarning: deprecation
  warnings.warn('deprecation', DeprecationWarning)
$ echo $?
0

Before reading the docs, I suspected the operating system maintainers didn't want to 'spam' the end users of their systems with python library warnings while going about their daily tasks, but apparently this is built into python itself.

So how should I deprecate things?

I'm a big fan of executable documentation, but this might be one of those cases where good old-fashioned documentation might be more effective, as we can't expect users of our library to run with warnings enabled.

I would still leave in these deprecations for the project's developers as well as for the pedantic users of the library.

Checking against deprecations of our dependencies

That's easier, as we own that code. I would run my builds and tests with -W d at the very least, but I would like to try to run with -W error. Except that I don't want to fail the build if one of my dependencies is using deprecated APIs, so probably I would just have a custom main.py where I would explicitly set and reset my warning filters. E.g., updating the above demo code would give me the following:

import warnings
import os

if os.environ.get('TEST', '0') == '1':
    warnings.filterwarnings(module='.*', action='ignore')
    warnings.filterwarnings(module=__name__, action='error')

if __name__ == '__main__':
    warnings.warn('deprecation', DeprecationWarning)

$ TEST=1 python3.5 demo.py
Traceback (most recent call last):
  File "demo.py", line 8, in <module>
    warnings.warn('deprecation', DeprecationWarning)
DeprecationWarning: deprecation
$ echo $?
1

I have yet to test how feasible it is when our library supports multiple versions of a dependent library (e.g. Django), but my gut feeling is that it should be doable.

Beware of the ordering of warning filters

If in the above example we got the order reversed, then our own DeprecationWarnings would be ignored too!

if os.environ.get('TEST', '0') == '1':
    warnings.filterwarnings(module=__name__, action='error')
    warnings.filterwarnings(module='.*', action='ignore')

The post warnings.warn - some DeprecationWarning gotchas first appeared on http://blog.zsoldosp.eu.

February 01, 2016 11:17 AM


Montreal Python User Group

Montréal-Python: Call to Action

Ladies and Gentlemen, we at Montreal Python are super excited for 2016 and we have come up with some great ideas.

In order to turn these ideas into reality, we will need some help. Montreal Python is an open community and collaboration is key to our success. So we are inviting beginners, experts and newcomers to join us at our next organization meeting on Monday February 8th.

There we will discuss our ideas and plans for the year.

Montreal has a pretty exciting Python scene, and thanks to the community, that is something we'll maintain for years to come. It is now your chance to come and make things happen.

When

Monday February 8th at 6pm

Where

Shopify Offices, 490 rue de la Gauchetière West, https://goo.gl/maps/MJrA2RN8e912

Who

Anyone who wants to help or is curious about the Python community in Montreal.

For those who can't attend, don't worry: send us an email with your ideas at mtlpyteam@googlegroups.com.

February 01, 2016 05:00 AM


Stein Magnus Jodal

January contributions

The following is a short summary of my open source work in January, just like I did back in November and December.

Debian

Mopidy

Comics

February 01, 2016 12:00 AM

January 31, 2016


A. Jesse Jiryu Davis

How To Hobble Your Python Web-Scraper With getaddrinfo()

This is the second article in what seems destined to be a four-part series about Python's getaddrinfo on Mac. Here, I discover that contention for the getaddrinfo lock makes connecting to localhost appear to time out.

Network Timeouts From asyncio

A Washington Post data scientist named Al Johri posted to the MongoDB User Group list, asking for help with a Python script. His script downloaded feeds from 500 sites concurrently and stored the feeds' links in MongoDB. Since this is the sort of problem async is good for, he used my async driver Motor. He'd chosen to implement his feed-fetcher on asyncio, with Motor's new asyncio integration and Andrew Svetlov's aiohttp library.

Al wrote:

Each feed has a variable number of articles (average 10?). So it should launch around 5000+ "concurrent" requests to insert into the database. I put concurrent in quotes because it's sending the insert requests as the downloads come in so it really shouldn't be that many requests per second. I understand PyMongo should be able to do at least 20k-30k plus?

He's right. And yet, Motor threw connection timeouts every time he ran his script. What was going wrong with Motor?

Three Clues

It was a Saturday afternoon when I saw Al's message to the mailing list; I wanted to leave it until Monday, but I couldn't stand the anxiety. What if my driver was buggy?

In Al's message I saw three clues. The first clue was, Motor made its initial connection to MongoDB without trouble, but while the script downloaded feeds and inserted links into the database, Motor began throwing timeouts. Since Motor was already connected to MongoDB, and since MongoDB was running on the same machine as his code, it seemed it must be a Motor bug.

I feel like what I'm trying to accomplish really shouldn't be this hard.

Al's code also threw connection errors from aiohttp, but this was less surprising than Motor's errors, since it was fetching from remote servers. Still, I noted this as a possible second clue.

The third clue was this: If Al turned his script's concurrency down from 500 feeds to 150 or less, Motor stopped timing out. Why?

Investigation

On Sunday, I ran Al's script on my Mac and reproduced the Motor errors. This was a relief, of course. A reproducible bug is a tractable one.

With some print statements and PyCharm, I determined that Motor occasionally expands its connection pool in order to increase its "insert" concurrency. That's when the errors happen.

I reviewed my connection-pool tests and verified that Motor can expand its connection pool under normal circumstances. So aiohttp must be fighting with Motor somehow.

I tracked down the location of the timeout to this line in the asyncio event loop, where it begins a DNS lookup on its thread pool:

def create_connection(self, host, port, family, type, proto, flags):
    executor = self.thread_pool_executor

    # Resolve the hostname on the thread pool before connecting.
    yield from executor.submit(
        socket.getaddrinfo,
        host, port, family, type, proto, flags)

Motor's first create_connection call always succeeded, but later calls sometimes timed out.

I wondered what the holdup was in the thread pool. So I printed its queue size before the getaddrinfo call:

# Ensure it's initialized.
if self._default_executor:
    q = self._default_executor._work_queue

    print("unfinished tasks: %d" % 
          q.unfinished_tasks)

There were hundreds of unfinished tasks! Why were these lookups getting clogged? I tried increasing the thread pool size, from the asyncio default of 5, to 50, to 500... but the timeouts happened just the same.

Eureka

I thought about the problem as I made dinner, I thought about it as I fell asleep, I thought about it while I was walking to the subway Monday morning in December's unseasonable warmth.

I recalled a PyMongo investigation where Anna Herlihy and I had explored CPython's getaddrinfo lock: On Mac, Python only allows one getaddrinfo call at a time. I was climbing the stairs out of the Times Square station near the office when I figured it out: Al's script was queueing on that getaddrinfo lock!

Diagnosis

When Motor opens a new connection to the MongoDB server, it starts a 20-second timer, then calls create_connection with the server address. If hundreds of other getaddrinfo calls are already enqueued, then Motor's call can spend more than 20 seconds waiting in line for the getaddrinfo lock. It doesn't matter that looking up "localhost" is near-instant: we need the lock first. It appears as if Motor can't connect to MongoDB, when in fact it simply couldn't get the getaddrinfo lock in time.

My theory explains the first clue: that Motor's initial connection succeeds. In the case of Al's script, specifically, Motor opens its first connection before aiohttp begins its hundreds of lookups, so there's no queue on the lock yet.

Then aiohttp starts 500 calls to getaddrinfo for the 500 feeds' domains. As feeds are fetched it inserts them into MongoDB.

There comes a moment when the script begins an insert while another insert is in progress. When this happens, Motor tries to open a new MongoDB connection to start the second insert concurrently. That's when things go wrong: since aiohttp has hundreds of getaddrinfo calls still in progress, Motor's new connection gets enqueued, waiting for the lock so it can resolve "localhost" again. After 20 seconds it gives up. Meanwhile, dozens of other Motor connections have piled up behind this one, and as they reach their 20-second timeouts they fail too.

Motor's not the only one suffering, of course. The aiohttp coroutines are all waiting in line, too. So my theory explained the second clue: the aiohttp errors were also caused by queueing on the getaddrinfo lock.

What about the third clue? Why does turning concurrency down to 150 fix the problem? My theory explains that, too. The first 150 hostnames in Al's list of feeds can all be resolved in under 20 seconds total. When Motor opens a connection it is certainly slow, but it doesn't time out.

Verification

An explanatory theory is good, but experimental evidence is even better. I designed three tests for my hypothesis.

First, I tried Al's script on Linux. The Python interpreter doesn't lock around getaddrinfo calls on Linux, so a large number of in-flight lookups shouldn't slow down Motor very much when it needs to resolve "localhost". Indeed, on Linux the script worked fine, and Motor could expand its connection pool easily.

Second, on my Mac, I tried setting Motor's maximum pool size to 1. This prevented Motor from trying to open more connections after the script began the feed-fetcher, so Motor never got stuck in line behind the fetcher. Capping the pool size at 1 didn't cost any performance in this application, since the script spent so little time writing to MongoDB compared to the time it spent fetching and parsing feeds.
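
That second experiment amounts to a one-line change (a sketch; the pool-size keyword follows the Motor 0.x / PyMongo 2.x naming, which is an assumption on my part):

from motor.motor_asyncio import AsyncIOMotorClient

# Cap the pool at one connection so Motor never has to queue behind
# the feed-fetcher's getaddrinfo calls to open a second one.
client = AsyncIOMotorClient('mongodb://localhost:27017', max_pool_size=1)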

For my third experiment, I patched the asyncio event loop to always resolve "localhost" to "127.0.0.1", skipping the getaddrinfo call. This also worked as I expected.
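
For the curious, such a patch can be approximated by subclassing the event loop (a rough reconstruction of the idea using Python 3.4-era asyncio API, not the actual experiment code):

import asyncio
import socket

class LocalhostLoop(asyncio.SelectorEventLoop):
    # Short-circuit DNS for "localhost" so a local connection never
    # waits behind the threaded getaddrinfo and its process-wide lock.
    @asyncio.coroutine
    def getaddrinfo(self, host, port, *, family=0, type=0, proto=0, flags=0):
        if host == 'localhost':
            return [(socket.AF_INET, socket.SOCK_STREAM,
                     socket.IPPROTO_TCP, '', ('127.0.0.1', port))]
        result = yield from super().getaddrinfo(
            host, port, family=family, type=type, proto=proto, flags=flags)
        return result

asyncio.set_event_loop(LocalhostLoop())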

Solution

I wrote back to Al Johri with my findings. His response made my day:

Holy crap, thank you so much. This is amazing!

I wish bug investigations always turned out this well.

But still—all I'd done was diagnose the problem. How should I solve it? Motor could cache lookups, or treat "localhost" specially. Or asyncio could make one of those changes instead of Motor. Or perhaps the asyncio method create_connection should take a connection timeout argument, since asyncio can tell the difference between a slow call to getaddrinfo and a genuine connection timeout.

Which solution did I choose? Stay tuned for the next installment!


Links:

  1. The original bug report on the MongoDB User Group list.
  2. Python's getaddrinfo lock.
  3. The full series on getaddrinfo on Mac

Images: Lunardo Fero, embroidery designs, Italy circa 1559. From Fashion and Virtue: Textile Patterns and the Print Revolution 1520–1620, by Femke Speelberg.

January 31, 2016 10:54 PM


Anarcat

My free software activities, January 2016

Debian Long Term Support (LTS)

This is my second month working on Debian LTS, started by Raphael Hertzog at Freexian. I think this month has been a little better for me, as I was able to push two DLAs (Debian LTS Advisories, similar to regular DSAs (Debian Security Advisories) but only applying to LTS releases).

phpMyAdmin and Prosody

I pushed DLAs for phpmyadmin (DLA-406-1) and prosody (CVE-2016-0756). Both were pretty trivial, but I still had to boot a squeeze VM to test the resulting packages, something that was harder than expected. Still, the packages were accepted in squeeze-lts and should work fine.

icu and JDK vulnerabilities

I also spent a good amount of time trying to untangle the mess that Java software has become, and in particular the icu vulnerabilities, CVE-2015-4844 and CVE-2016-0494. I ended up being able to backport patches and build packages, not without a significant amount of pain because of how upstream failed to clearly identify which patches did what.

The fact that they (Oracle) did not notify their own upstream (icu) is also a really questionable practice in the free software world, which doesn't come as a surprise coming from Oracle anymore, unfortunately. Even worse, CVE-2016-0494 was actually introduced as part of the fix for CVE-2015-4844. I am not even sure the patches provided actually fix the problem because of course Oracle didn't clearly state what the problem was or how to exploit it.

Still: I did the best I could under the circumstances and built packages which I shared with the debian-lts list in the hope others could test it. I am not much familiar with the icu package or even Java anymore, so I do not feel comfortable uploading those fixes directly right now, especially since I just trust whatever was said on the Redhat and icu bugtrackers. Hopefully someone else can pick this up and confirm I had the right approach.

OpenSSH vulnerabilities

I also worked on CVE-2016-1908, a fairly awkward vulnerability in OpenSSH involving bypassing a security check in the X server that forbids certain clients from looking at keystrokes, selections and other stuff from other clients. The problem is pretty well described in this article. It is, basically, that there are two ways for applications to talk to the X server: "trusted" and "untrusted". If an application is "trusted", it can do all sorts of stuff like manipulate the clipboard, send keystrokes to other applications, sniff keystrokes and so on. This seems fine if you are running local apps (a good example is xdotool to test this) but can be pretty bad once X forwarding comes into play in SSH, because then the remote server can use your X credentials to run arbitrary X code in your local X server. In other words, once you forward X, you trust the remote server as being local, more or less.

This is why OpenSSH 3.8 introduced the distinction between -X (untrusted) and -Y (trusted). Unfortunately, after quite a bit of research and work to reproduce the issue (I could not reproduce the issue!), I realized that Debian has, ever since 3.8 was released (around the "sarge" days!), forcibly defaulted ForwardX11Trusted to yes, which makes -X and -Y behave the same way. I described all of this in a post to the LTS list and OpenSSH maintainers, and it seems there were good reasons for this back then (-X actually breaks a lot of X clients; for example, selecting text will crash xterm), but I still don't quite see why we shouldn't tell people to use -Y consciously if they need to, instead of defaulting to the nasty insecure behavior.

Anyways, this will probably end up being swept under the rug for usability reasons, but just keep in mind that -X can be pretty nasty if you run it against an untrusted server.

Xscreensaver vulnerability

This one was fun. JWZ finally got bitten by his own rants and a pretty embarrassing vulnerability (CVE-2015-8025) that allowed one to crash the login dialog (and unlock the screen) by hot swapping external monitors (!). I worked on trying to reproduce the issue (I couldn't: my HDMI connector stopped working on my laptop, presumably because of a Linux kernel backport) and building the patches provided by the maintainer on wheezy, and pushed debdiffs to the security team which proceeded with the upload.

Other LTS work

I still spend a bit too much time for my taste trying to find work in LTS land. Very often, all the issues are assigned, or the ones that remain seem impossible to fix (like the icu vulnerabilities) or belong to packages so big that I am scared to work on them (like eglibc). Still, the last week of work was much better, thanks to the excellent work that Guido Günther has done on the front desk duties this week. My turn is coming up next week and I hope I can do the same for my fellow LTS workers.

Oh, and I tried to reproduce the cpio issue (CVE-2016-2037) and failed, because I didn't know enough about Valgrind. But even then, I don't exactly know where to start to fix that issue. It seems no one does, because this unmaintained package is still not fixed anywhere...

systemd-nspawn adventures

In testing the OpenSSH, phpMyAdmin and prosody issues, I had high hopes that systemd-nspawn would enable me to run an isolated squeeze container reliably. But I had trouble: for some reason, squeeze does not seem to like nspawn at all. First off, it completely refuses to boot because it doesn't recognize the target as an "OS root directory", which, apparently, needs the os-release file (in /etc, but it doesn't say that because it would be too easy):

$ sudo systemd-nspawn -b  -D /var/cache/pbuilder/squeeze-amd64-vm
Directory /home/pbuilder/squeeze-amd64-vm doesn't look like an OS root directory (os-release file is missing). Refusing.

I just created that as an empty file (it works: I also tried copying it from a Jessie system and "faking" the data in it, but it's not necessary), and then nspawn accepts booting it. The next problem is that it just hangs there: it seems that the getty programs can't talk to the nspawn console:

$ sudo systemd-nspawn -b  -D /var/cache/pbuilder/squeeze-amd64-vm
Spawning container squeeze-amd64-vm on /home/pbuilder/squeeze-amd64-vm.
Press ^] three times within 1s to kill container.
/etc/localtime is not a symlink, not updating container timezone.
INIT: version 2.88 booting
Using makefile-style concurrent boot in runlevel S.
Setting the system clock.
Cannot access the Hardware Clock via any known method.
Use the --debug option to see the details of our search for an access method.
Unable to set System Clock to: Sun Jan 31 15:57:31 UTC 2016 ... (warning).
Activating swap...done.
Setting the system clock.
Cannot access the Hardware Clock via any known method.
Use the --debug option to see the details of our search for an access method.
Unable to set System Clock to: Sun Jan 31 15:57:31 UTC 2016 ... (warning).
Activating lvm and md swap...done.
Checking file systems...fsck from util-linux-ng 2.17.2
done.
Mounting local filesystems...done.
Activating swapfile swap...done.
Cleaning up temporary files....
Cleaning up temporary files....
INIT: Entering runlevel: 2
Using makefile-style concurrent boot in runlevel 2.
INIT: Id "2" respawning too fast: disabled for 5 minutes
INIT: Id "1" respawning too fast: disabled for 5 minutes
INIT: Id "3" respawning too fast: disabled for 5 minutes
INIT: Id "4" respawning too fast: disabled for 5 minutes
INIT: Id "5" respawning too fast: disabled for 5 minutes
INIT: Id "6" respawning too fast: disabled for 5 minutes
INIT: no more processes left in this runlevel

Note that before the INIT messages show up, quite a bit of time passes, around a minute or two. And then the container is just stuck there: no login prompt, no nothing. Turning off the VM is also difficult:

$ sudo machinectl list
MACHINE                          CONTAINER SERVICE
squeeze-amd64-vm                 container nspawn

1 machines listed.
$ sudo machinectl status squeeze-amd64-vm
squeeze-amd64-vm
           Since: dim 2016-01-31 10:57:31 EST; 4min 44s ago
          Leader: 3983 (init)
         Service: nspawn; class container
            Root: /home/pbuilder/squeeze-amd64-vm
         Address: fe80::ee55:f9ff:fec5:f255
                  2001:1928:1:9:ee55:f9ff:fec5:f255
                  fe80::ea9a:8fff:fe6e:f60
                  2001:1928:1:9:ea9a:8fff:fe6e:f60
                  192.168.0.166
            Unit: machine-squeeze\x2damd64\x2dvm.scope
                  ├─3983 init [2]
                  └─4204 lua /usr/bin/prosody
$ sudo machinectl poweroff squeeze-amd64-vm
# does nothing
$ sudo machinectl terminate squeeze-amd64-vm
# also does nothing
$ sudo kill 3983
# does nothing
$ sudo kill -9 3983
$

So only the latter kill -9 worked:

Container squeeze-amd64-vm terminated by signal KILL.

Pretty annoying! So I ended up doing all my tests in a chroot, which involved shutting down the web server on my laptop (for phpmyadmin) and removing policy-rc.d to allow the services to start in the chroot. That worked, but I would prefer to run that code in a container. I'd be happy to hear how other maintainers are handling this kind of stuff.

For the OpenSSH vulnerability testing, I also wanted to have an X server running from squeeze, something which I found surprisingly hard. I was not able to figure out how to make qemu boot from a directory (the above chroot), so I turned to the Squeeze live images from the cdimage archive. Qemu, for some reason, was not able to boot those either: I would only get a grey screen. So I ended up installing Virtualbox, which worked perfectly, but I'd love to hear how I could handle this better as well.

Other free software work

As usual, I did tons more stuff on the computer this month. Way more than I should, actually. I am trying to take some time to reflect upon my work and life these days, and the computer is more part of the problem than the solution, so those feel like a vice that I can't get rid of more than an accomplishment. Still, you might be interested to know about those, so here they are.

Ledger timetracking

I am tracking the time I work on various issues through the overwhelming org-mode in Emacs. The rationale I had was that I didn't want to bother writing yet another time tracker, having written at least two before. One is the old phpTimeTracker, the other is a rewrite that never got anywhere, and finally, I had to deal with the formidable kProject during my time at Koumbit.org. All of those made me totally allergic to project trackers, timetrackers, and reinventing the wheel, so I figured it made sense to use an already existing solution.

Plus, org-mode allows me to track todos in a fairly meaningful way and I can punch into todo items fairly transparently. I also had a hunch I could bridge this with ledger, a lightweight accounting tool I started using recently. I was previously using the heavier Gnucash, almost a decade ago - but I really was seduced by the idea of a commandline tool that stores its data in a flat file that I can checkin to git.

How wrong I was! First off, ledger can't read org files out of the box. It's weird, but you need to convert those files into timeclock.el formatted files, which, oddly enough, is a completely different file format from a completely different timetracker. Still, it's an interesting format. An example:

i 1970-01-01 12:00:00 project  test 4h
o 1970-01-01 16:00:00

... which makes it possible to write a timetracker with two simple shell aliases:

export TIMELOG=$HOME/.timelog
alias ti="echo i `date '+%Y-%m-%d %H:%M:%S'` \$* >>$TIMELOG"
alias to="echo o `date '+%Y-%m-%d %H:%M:%S'` >>$TIMELOG"

How's that for simplicity!
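
And the format is just as easy to consume programmatically. A toy Python sketch (my own, not part of the ledger toolchain) that pairs check-ins with check-outs and totals the hours:

import datetime

def parse_timeclock(lines):
    # Pair each "i <start> <account>" line with the following
    # "o <end>" line and yield (account, hours).
    fmt = '%Y-%m-%d %H:%M:%S'
    start = account = None
    for line in lines:
        parts = line.split()
        if parts and parts[0] == 'i':
            start = datetime.datetime.strptime(' '.join(parts[1:3]), fmt)
            account = ' '.join(parts[3:])
        elif parts and parts[0] == 'o' and start is not None:
            end = datetime.datetime.strptime(' '.join(parts[1:3]), fmt)
            yield account, (end - start).total_seconds() / 3600.0
            start = None

entries = ['i 1970-01-01 12:00:00 project  test 4h',
           'o 1970-01-01 16:00:00']
for account, hours in parse_timeclock(entries):
    print('%s: %.1f hours' % (account, hours))  # project test 4h: 4.0 hours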

So you use John Wiegley's org2tc (or my fork which adds a few improvements) to convert from org to timeclock.el. From there on, a bunch of tools can do reporting on those files, the most interesting being obviously ledger itself, as it can read those files natively (although hledger has trouble including them). So far so good: I can do time tracking very easily and report on my time now!

Now, to turn this into bills and actual accounting, well... it's really much more complicated. To make a long story short, it works, but I really had to pull my hair out and ended up making yet another git repository to demonstrate how that could work. I am now stuck at the actual step of generating bills more automatically, which seems to be a total pain. Previous examples and documentation I found were limited, and while I feel that some people are actively doing this, they have yet to reveal their magic sauce in a meaningful way. I was told on IRC that no one has actually achieved converting timeclock.el entries directly into bills...

Kodi

I have done a small patch to the rom collection browser to turn off an annoying "OK" dialog that would block import of ROMs. This actually was way more involved than expected, considering the size of the patch: I had to export the project to Github since the original repository at Google Code is now archived, just like all Google Code repositories. I hope someone will pick it up from there.

Sopel

I have finally got my small patch for SNI support in Sopel! It turns out they are phasing out their own web module in favor of Requests, something that was refused last year. It seems the Sopel developers finally saw the interest in avoiding the maintenance cost of their own complete HTTP library... in an IRC bot.

Working on this patch, I filed a bug in requests which was promptly fixed.

Feed2tweet and spigot

I already mentioned how I linked this blog to Twitter through the use of feed2tweet. Since then, some of my pull requests and issues were merged, and others are still pending.

In the meantime, I figured it would make sense to also post to identi.ca. This turned out to be surprisingly more difficult - the only bridge available would not work very well for me. I also filed a bunch of issues in the hope things would stabilize, but so far I have not made this work properly.

It seems to me that all of this stuff is really just reinventing the wheel. There are pretty neat IM libraries out there; one that is actively developed is libturpial, used in the Turpial client. It currently only supports status.net and Twitter, but if Pump.io support is implemented, it would solve all of the above problems at once...

Git-mediawiki

Did I mention how awesome the git-mediawiki remote is? It allows me to clone Mediawiki wikis and transparently read and write to them using usual git commands! I use it to keep a mirror of the amateur radio wiki site, for example, as it makes no sense to me to not have this site available offline. I was trying to mirror Wikivoyage and it would block at 500 pages, so I made a patch to support larger wikis.

Borg resignation

Finally, it is with great sadness that I announce that I have left the Borg backup project. It seems that my views of release and project management are irreconcilable with those of the maintainer of the Attic fork. Those looking for more details and explanations are welcome to look in the issue tracker for the various discussions regarding i18n, the support timeframe and compatibility policy, or to contact me personally.

January 31, 2016 10:18 PM


Kushal Das

Second Fedora Pune meetup in January

On the 22nd of January evening we had the second Fedora meetup in Pune. There were 12 participants in this meetup. We started a discussion about what happens when someone compiles a program written in C. Siddhesh started by asking various questions about what we think happens. He then went into the details of each step in the compiler and assembler. We discussed ELF files and went through the various sections. After looking into the __init, __fini, and __main functions, one of the participants said “my whole life was a lie!” :D No one had thought about constructors and destructors in a C program.

We also discussed for loops in C and in Python, and list comprehensions. One more point that was new to me is that there are 6 registers available for each function call in x86. At the end, the participants decided to find bugs/features in different projects they use regularly (we also suggested Gnome apps); first everyone will try to fix those at home, and if they cannot fix them by the next meeting, we will look into helping them during the meeting.

January 31, 2016 05:55 PM


Investing using Python

Garman-Klass volatility estimator and HAR-RV model for realized volatility using Python

HAR-RV ("Heterogeneous AutoRegressive model for Realized Volatility") is a pretty simple model based on the so-called "Heterogeneous Market Hypothesis", which states that financial markets move as an interaction of market players acting at different frequencies (like intraday, day, week or month).[1] The HAR-RV(1) formula: [formula not preserved in this excerpt] For the sake of interest I've tried to feed it with […]
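
Filling in the pieces the excerpt references (hedged reconstructions of the standard formulations, not the post's own code): the Garman-Klass per-bar variance estimate is 0.5*ln(H/L)^2 - (2*ln(2) - 1)*ln(C/O)^2, and the HAR-RV(1) regression is RV[t] = b0 + bd*RV[t-1] + bw*RVweek[t-1] + bm*RVmonth[t-1] + e[t], where the weekly and monthly terms are 5- and 22-day trailing means. A minimal numpy sketch:

import numpy as np

def garman_klass_var(open_, high, low, close):
    # Garman-Klass variance per bar from OHLC arrays.
    o, h, l, c = map(np.asarray, (open_, high, low, close))
    return 0.5 * np.log(h / l) ** 2 - (2 * np.log(2) - 1) * np.log(c / o) ** 2

def har_rv_fit(rv):
    # OLS fit of RV[t] on RV[t-1] and its 5-day (weekly) and
    # 22-day (monthly) trailing means.
    rv = np.asarray(rv, dtype=float)
    t = np.arange(22, len(rv))
    daily = rv[t - 1]
    weekly = np.array([rv[i - 5:i].mean() for i in t])
    monthly = np.array([rv[i - 22:i].mean() for i in t])
    X = np.column_stack([np.ones_like(daily), daily, weekly, monthly])
    beta, *_ = np.linalg.lstsq(X, rv[t], rcond=None)
    return beta  # [b0, bd, bw, bm]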

January 31, 2016 05:47 PM


Podcast.__init__

Episode 42 - SymPy With Aaron Meurer

Visit our site to listen to past episodes, support the show, join our community, and sign up for our mailing list.

Summary

Looking for an open source alternative to Mathematica or MATLAB for solving algebraic equations? Look no further than the excellent SymPy project. It is a well-built and easy-to-use Computer Algebra System (CAS), and in this episode we spoke with the current project maintainer, Aaron Meurer, about its capabilities and when you might want to use it.
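
As a taste of what that looks like in practice (a minimal sketch of my own, not from the episode):

    from sympy import symbols, solve

    x = symbols('x')
    # Solve x**2 - 2 == 0 symbolically; SymPy returns exact roots.
    print(solve(x**2 - 2, x))   # [-sqrt(2), sqrt(2)]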

Brief Introduction

We are recording today on January 18th, 2016, and your hosts as usual are Tobias Macey and Chris Patti. Today we are interviewing Aaron Meurer about SymPy.

Linode Sponsor Banner

Use the promo code podcastinit20 to get a $20 credit when you sign up!

Hired Logo

On Hired, software engineers and designers can get 5+ interview requests in a week, and each offer has salary and equity upfront. With full-time and contract opportunities available, users can view the offers and accept or reject them before talking to any company. Work with over 2,500 companies, from startups to large public companies, hailing from 12 major tech hubs in North America and Europe. Hired is totally free for users, and if you get a job you’ll get a $2,000 “thank you” bonus. If you use our special link to sign up, then that bonus will double to $4,000 when you accept a job. If you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus when they accept a job.

Interview with Aaron Meurer

Introductions
How did you get introduced to Python? - Chris
What is SymPy and what kinds of problems does it aim to solve? - Chris
How did the SymPy project get started? - Tobias
How did you get started with the SymPy project? - Chris
Are there any limits to the complexity of the equations SymPy can model and solve? - Chris
How does SymPy compare to similar projects in other languages? - Tobias
How does SymPy render results using such beautiful mathematical symbols when the inputs are simple ASCII? - Chris
What are some of the challenges in creating documentation for a project like SymPy that is accessible to non-experts while still having the necessary information for professionals in the fields of mathematics? - Tobias
Which fields of academia and business seem to be most heavily represented in the users of SymPy? - Tobias
What are some of the uses of SymPy in education, outside of the obvious like students checking their homework? - Chris
How does SymPy integrate with the Jupyter Notebook? - Chris
Is SymPy generally used more as an interactive mathematics environment or as a library integrated within a larger application? - Tobias
What were the challenges moving SymPy from Python 2 to Python 3? - Chris
Are there features of Python 3 that simplify your work on SymPy or that make it possible to add new features that would have been too difficult previously? - Tobias
Were there any performance bottlenecks you needed to overcome in creating SymPy? - Chris
What are some of the interesting design or implementation challenges you've found? - Chris

Picks

Keep In Touch

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

January 31, 2016 02:02 PM


Jorgen Schäfer

Elpy 1.11.0 released

I just released version 1.11.0 of Elpy, the Emacs Python Development Environment. This is a feature release.

Elpy is an Emacs package to bring powerful Python editing to Emacs. It combines and configures a number of other packages, written in both Emacs Lisp and Python.

Quick Installation

Evaluate this:

(require 'package)
(add-to-list 'package-archives
             '("elpy" . "https://jorgenschaefer.github.io/packages/"))

Then run M-x package-install RET elpy RET.

Finally, run the following (and add them to your .emacs):

(package-initialize)
(elpy-enable)

Changes in 1.11.0

Thanks to ChillarAnand, Clément Pit–Claudel, Daniel Gopar, Moritz Kuett, Shuai Lin and Zhaorong Ma for their contributions!

January 31, 2016 10:26 AM


Brian Okken

Test Case Design using Given-When-Then from BDD (PT010)

Designing your test methods using a simple structure such as given-when-then will help you.

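As a quick illustration of the structure (a pytest-style sketch of my own; the Account class is hypothetical and exists only for this example):

    class Account:
        # A toy class used only for this illustration.
        def __init__(self, balance=0):
            self.balance = balance

        def deposit(self, amount):
            self.balance += amount

    def test_deposit_increases_balance():
        # Given: an account with an opening balance
        account = Account(balance=100)
        # When: we deposit money
        account.deposit(50)
        # Then: the balance reflects the deposit
        assert account.balance == 150
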
Discussed:

The post Test Case Design using Given-When-Then from BDD (PT010) appeared first on Python Testing.

January 31, 2016 07:27 AM

January 30, 2016


John Cook

General birthday problem

The birthday problem, sometimes called the birthday paradox, says that it’s more likely than you’d expect that two people in a group have the same birthday. Specifically, in a random sample of 23 people, there’s about a 50-50 chance that at least two people share the same birthday.

The birthday problem makes a nice party trick, but generalizations of the problem come up frequently in applications. I wrote in the previous post how it comes up in seeding distributed Monte Carlo simulations. In computer science, it’s a concern in hashing algorithms.

If you have a set of N things to choose from, such as N = 365 birthdays, and take r samples, the probability that all r samples are unique is

p = \frac{N!}{N^r (N-r)!}

and the probability that at least two of the samples are the same is 1 – p. (This assumes that N is at least as big as r. Otherwise the denominator is undefined, but in that case we know p is 0.)

With moderately large values of N and r, the formula is likely to overflow if implemented directly. So, as usual, the trick is to use logarithms to avoid overflow or underflow. Here’s how you could compute the probability above in Python. SciPy doesn’t have a log factorial function, but it does have a log gamma function, and since N! = Γ(N+1), we can use that instead.

    from scipy import exp, log
    from scipy.special import gammaln

    def prob_unique(N, r):
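        # p = N! / (N^r (N-r)!) computed in log space via gammaln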
        return exp( gammaln(N+1) - gammaln(N-r+1) - r*log(N) )

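Plugging in the classic numbers reproduces the 50-50 result quoted above:

    # Probability that at least two of 23 people share a birthday:
    print(1 - prob_unique(365, 23))   # about 0.507
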
Related: How to calculate binomial probabilities

January 30, 2016 09:35 PM


BangPypers

January 2016 Salt stack workshop report

We kickstarted this year's meetups with a Salt Stack workshop at MavenHive Technologies. There were 23 participants.

Ram took the session. He started with what DevOps is, why we need it, and the various tools available for it.

Then he explained Salt Stack concepts (master, minion, grains, pillars, modules, etc.) and how to automate configuration management across multiple servers.
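
For a flavour of what this looks like from Python, here is a minimal sketch (it assumes you are on a configured Salt master with minions attached, typically running as root):

    import salt.client

    # LocalClient talks to minions through the running master.
    local = salt.client.LocalClient()

    # Ping every connected minion.
    print(local.cmd('*', 'test.ping'))

    # Call any execution module, e.g. report disk usage everywhere.
    print(local.cmd('*', 'disk.usage'))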

There was a break for about 10 minutes.

After resuming from the break, participants tried to deploy a web app (Junction) to a virtual machine using Salt Stack. There were questions from participants about grains and pillars, and Ram clarified them.

Here are a few pictures from the meetup.

image

image2

image3

Thanks to Ram for taking the session and to MavenHive for hosting us.

January 30, 2016 01:45 PM

January 29, 2016


Investing using Python

Monte Carlo for binary options, all odds for the house

A quick remake of a simple Monte Carlo simulation for binary options, as I was interested in whether the so-called "81% payout" doesn't in fact stack all the odds in favour of the house (the dealer). So, let's say I have a proven method ("advantage"/"edge") to win 55% of the time, and I want to calculate how much money I would need if I […]
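
The setup described is easy to sketch (my back-of-the-envelope version, not the post's code):

    import random

    def average_return(trials=100000, p_win=0.55, payout=0.81):
        # Net result per unit staked: a win pays 81% of the stake,
        # a loss forfeits the whole stake.
        pnl = 0.0
        for _ in range(trials):
            pnl += payout if random.random() < p_win else -1.0
        return pnl / trials

    # The expected value per trade is 0.55 * 0.81 - 0.45 = -0.0045:
    # slightly negative even with a genuine 55% edge.
    print(average_return())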

January 29, 2016 07:27 PM


Weekly Python StackOverflow Report

(iv) stackoverflow python report

These are the ten most highly rated Python questions on Stack Overflow last week.
Between brackets: [question score / answers count]
Build date: 2016-01-29 18:11:08 GMT


  1. Why is max slower than sort in Python? - [63/3]
  2. Difference between a -= b and a = a - b in Python - [44/3]
  3. Why does 'range(0) == range(2, 2, 2)' equal True? - [28/5]
  4. How to remove adjacent duplicate elements in a list using list comprehensions? - [14/5]
  5. How do I transform a multi-level list into a list of strings in Python? - [12/3]
  6. Python - Plotting velocity and acceleration vectors at certain points - [10/1]
  7. Why attribute lookup in Python is designed this way (precedence chain)? - [9/3]
  8. Softmax function - python - [8/2]
  9. Scaling and fitting to a log-normal distribution using a logarithmic axis in python - [8/2]
  10. completely self-contained virtual environment - [8/1]

January 29, 2016 06:18 PM


PyTennessee

PyTN Profiles: Matt George (@binarydud) and Emma (@emmaemail)

Speaker Profile: Matt George (@binarydud)

Matt is a software developer hailing from Erie, CO. He has been working with python for over 6 years and is currently a Software Developer for Rackspace.

Matt will be presenting “Managing an event oriented architecture in AWS” at 1PM Sunday. With things like Kinesis streams, Lambda functions, and CloudWatch alarms, AWS gives you an incredible number of tools to design and build applications. Learn how to deploy, manage, and monitor your infrastructure in AWS using an event-oriented approach.

Sponsor Profile: Emma (@emmaemail)

Emma helps you build smarter email programs that maximize the power of marketing’s most effective channel.

January 29, 2016 04:16 PM

PyTN Profiles: Derik Pell and The Iron Yard (@TheIronYard)

Speaker Profile: Derik Pell

Derik got his Master’s in Computer Science from The University of Illinois at Springfield, where he was lucky enough to study data from many different angles. That work started him on a love affair with picking apart data while writing the best code he can.

He is happy to have lived through Java so he can now enjoy writing Python at Emma in Nashville, TN, where he lives with his partner, his son, and the head of the household, a chihuahua named Kala.

Derik will be presenting “Maybe I shouldn’t be a programmer.” at 11AM Sunday. Imposter syndrome isn’t just a catchphrase; it’s a real crisis of confidence that we all face as developers, one which can be especially hard for new developers who may not realize that it often comes with the job. This talk will bring the topic to light by discussing things that can cause imposter syndrome and ways of coping and getting through these bumps.

Sponsor Profile: The Iron Yard (@TheIronYard)

The Iron Yard is the world’s largest code school. It exists to create exceptional growth for people and their ideas through tech-focused education. It offers full-time and part-time programs in Back-End Engineering, Front-End Engineering, Mobile Engineering, Data Science and Design. Locations include 17 campuses in the U.S. and UK.

January 29, 2016 01:15 PM


Import Python

ImportPython Issue 59


Word From Our Sponsor


Python programmers: let companies apply to you, not the other way around. Receive interview offers that include salary and equity information. Companies see each other's offers and compete for your attention. Engage only with the companies you like. REGISTER

Worthy Read

book review
Minecraft is a sandbox video game: a game where you build constructions out of textured cubes in a procedurally generated 3D world, then explore it, gather resources, and plan combat. What makes Minecraft interesting here is that you can use its Python API to create and control the world from Python. Here is a review of the book.

book review
Ned Batchelder endorses Harry Percival's book Test-Driven Development with Python. If you have been reading this newsletter for long, you'll know we love the book. Here's more proof that it's worth reading.

Where I work, we are actively using Apache Mesos to deploy our production applications written in Go, Python, Lua, etc. We also use Chronos and Marathon. In this article I’ll show you how to set up a local Mesos cluster complete with Marathon (using Docker), and how to build a simple Python 3.x application to deploy into it.

This repository holds a collection of states and modules for deployments using SaltStack. These exist primarily to support the Caktus Django project template.

interview
Katie Bell is a developer at Grok Learning, where she’s been doing a combination of things since joining the team in March 2015. She builds new components of the learning platform and also writes course content. Grok Learning provides programming and web development courses to be used in schools. Before Katie moved back to Sydney to join Grok, she was a Site Reliability Engineer at Google in Switzerland, working on storage systems.

meza is a Python library for reading and processing tabular data. It has a functional programming style API, excels at reading/writing large files, and can process 10+ file types.

pycon
PyDelhi Conference is an upcoming conference hosted by the PyDelhi community, which focuses on using and developing with the Python programming language. The conference, now in its first year, will be held annually. We hope to attract the best Python programmers from across the country and abroad.

redis
walrus is my go-to toolkit for working with Redis in Python, and hopefully this post will convince you that it can be your go-to as well. I've tried to include lots of high-level Python APIs built on Redis primitives and the result is quite a lot of functionality. In this post I'll take you on a tour of the library and show examples of how it might be useful in your next project.

flask
Py URL Shortener is a Python-powered Flask app implementing a technique in which a Uniform Resource Locator (URL) is made substantially shorter while still directing to the required page. This is achieved by using a redirect on a short domain name that links to the web page with the long URL.
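
The core of that redirect trick is only a few lines of Flask; a minimal sketch (the slug and mapping below are hypothetical, not the app's actual code):

    from flask import Flask, abort, redirect

    app = Flask(__name__)

    # In a real app this mapping would live in a database.
    URLS = {'abc123': 'https://example.com/some/very/long/path'}

    @app.route('/<slug>')
    def follow(slug):
        long_url = URLS.get(slug)
        if long_url is None:
            abort(404)
        return redirect(long_url, code=301)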

This PEP proposes the creation of a new platform tag for built distributions of Python packages, such as wheels, called manylinux1_{x86_64,i386}, with external dependencies limited to a standardized, restricted subset of the Linux kernel and core userspace ABI. It proposes that PyPI support uploading and distributing wheels with this platform tag, and that pip support downloading and installing these packages on compatible platforms.

django
django-scribbler is an application for managing snippets of text for a Django website. http://readthedocs.org/docs/django-scribbler/


Jobs


Chennai, Tamil Nadu, India




Projects

whatportis - 104 Stars, 12 Fork
A command to search port names and numbers

django-stackoverflow-trace - 104 Stars, 3 Fork
A customized django stack trace

flyover - 62 Stars, 5 Fork
what's that plane flying over my apartment RIGHT NOW?

open-syllabus-project - 30 Stars, 0 Fork
What can be learned from 1M+ college course syllabi?

jenkins-phoenix - 21 Stars, 5 Fork
Stateless Jenkins deployment with Docker

django-post-request-task - 16 Stars, 2 Fork
A celery task class whose execution is delayed until after the request finishes, using request_started and request_finished signals from django.

preprocessor - 12 Stars, 1 Fork
Elegant tweet preprocessing

pipstrap - 9 Stars, 1 Fork
A small script that can act as a trust root for installing pip 8

pipgh - 8 Stars, 1 Fork
A tool to install python packages from Github.

Gitffiti - 7 Stars, 0 Fork
Gitffiti Repo

January 29, 2016 10:40 AM


Martin Fitzpatrick

Create Simple GUI Applications with Python and Qt

Buy the book

Create Simple GUI Applications will show you how to use Python and Qt to do just that.

This ebook was written to accompany the video tutorial course on Udemy, adding background and detail to the lectures, with more examples and reference documentation. However, it stands perfectly well alone as a solid introduction to programming Python GUI applications using the PyQt framework. The first chapter covers all that you need to quickly start building functional applications.

This is my first ebook, and my first time publishing on the Leanpub platform, which is designed for in-progress publishing. The book will continue to be extended with more material as the course expands, and these updates will remain free to everyone who has previously bought the book.

A web version of the book is available to read online for free.

Hope you find the book useful, and let me know what you think!

January 29, 2016 07:25 AM