PyCon India 2014

Last month I attended the PyCon India conference, which was held in Bangalore from 26 to 28 September. It was a trip full of firsts: my first open source conference, the first time I travelled alone to a new city, the first time I met Indian Fedora contributors, the first time I volunteered at a conference, and the first time I gave a talk at one. I also became a member of the Python Software Society of India (PSSI) and the Durgapur Linux User Group (DGPlug).

I reached Bangalore one day before the conference started and went directly to the conference venue, where workshops were going on at that time. There I met Kushal Das da, Ratnadeep da, Elita, Devyani and a few volunteers. I came to know that my friends Elita and Devyani were volunteering for the conference. I was also interested, so I asked Chandan da, who was actively involved in organising the whole event. He agreed and I joined the team. Next we had to arrange our accommodation. Thanks to Kushal da and Ratnadeep da for helping to arrange everything. I stayed with my new friends Elita and Devyani.

Day 1: Though we were supposed to reach early, we got a bit late. In a hurry, we first registered ourselves for the conference and had breakfast. The day began with a keynote by Kushal Das. He shared stories from his experience: how he got started in open source, the different people he met and the impact they had on his life. After this we had regular talks, but I didn’t like them much. There were booths for the companies that sponsored the event. I visited their booths and discussed their ideas and the technology they are using. I also got goodies from these companies, which was the most exciting part. The day ended with a lovely dinner treat given to all the volunteers.

Day 2: It started with a keynote by Michael Foord. In the afternoon, after lunch, there was a meeting of all DGPlug members on the staircase, where we discussed the ongoing projects, future goals and upcoming events.


I gave a short lightning talk on Datagrepper, the project I worked on in OPW. I got barely 10 minutes to speak, but I tried to make the most of them.


After the conference there was a meeting of all PSSI members. I attended that meeting, registered for the group and became a member. The day ended with dinner with DGPlug members at a Chinese restaurant. I had a great time at my first ever conference, and the best part was that I could meet Fedora contributors whom I had only talked to on IRC.

Overview of Shumgrepper functionalities

We have created a dev instance for shumgrepper at http://209.132.184.120/ . Below is a short summary of the functionality I have implemented so far:

UI

 

  • /    –  Home page; tells the user what the application is and what can be done with it
  • /md5/<md5>     –  Returns all the files in all the packages matching this md5sum

e.g.   /md5/f4aafb270c2f983f35b365aad5fe8870

  • /sha1/<sha1>    –  Returns all the files in all the packages matching this sha1sum

e.g.   /sha1/f83618056ae5601b74a75db03739dd3ec24292f5

  • /sha256/<sha256> – Returns all the files in all the packages matching this sha256sum

e.g.  /sha256/e77b543aefd1595f159e541041a403c48a240913bc65ca5c4267df096f775eb6

  • /tar_sum/<tar_sum> – Returns all the files in all the packages matching this tar_sum

e.g. /tar_sum/4a31a53097eaf029df45dd36ab622a57

  • /packages – Lists all the packages
  • /package/<package> – Gives an overview of the package and its different versions

e.g.  /package/fotoxx

  • /package/<package>/filenames – List all the filenames present in a package

e.g. /package/fotoxx/filenames

  • /tar_file/<tar_file>/filenames – List all the filenames present in a specific package version

e.g. /tar_file/fotoxx-14.05.1.tar.gz/filenames

  • /filename/<filename> – Returns all the file details matching that filename

e.g.  /filename/knot-1.5.0%2Fconfig.guess

  • /compare/common – Compares two or more tar_files by their sha256sum values and returns the filenames common to all of them.

e.g. /compare/common?tar_file=fedora-release-21.tar.bz2&tar_file=fedora-release-22.tar.bz2

  • /compare/difference – Compares two or more tar_files by their sha256sum values and returns the differing filenames, i.e. filenames that are not common to all the tar_files

e.g. /compare/difference?tar_file=fedora-release-21.tar.bz2&tar_file=fedora-release-22.tar.bz2

  • /history/<package> – Returns the history of the package, showing its evolution across all its releases: the files that have changed and the files that have remained the same compared to the previous version. It has not been implemented yet.

 


 

API

 

The API returns output in JSON format, and requests can be made with http get (HTTPie), curl or wget.

  • /api – Contains documentation for API
  • /api/sha1/<sha1sum> – Returns all the files in all the packages matching this sha1sum.

e.g. /api/sha1/e0ec2c54e7a4fabb2f7e8c78d711efc0ed5f4f43

  • /api/sha256/<sha256sum> – Returns all the files in all the packages matching this sha256sum

e.g. /api/sha256/e77b543aefd1595f159e541041a403c48a240913bc65ca5c4267df096f775eb6

  • /api/md5/<md5sum> – Returns all the files in all the packages matching this md5sum

e.g.  /api/md5/bf6f8d7c7022b27534011c4ad8334e2a

  • /api/tar_sum/<tar_sum> – Returns all the files in all the packages matching this tar_sum

e.g. /api/tar_sum/4a31a53097eaf029df45dd36ab622a57

  • /api/package/<package> – Returns all the package details with all the versions available for this package.

e.g.  /api/package/fotoxx

  • /api/package/<package>/filenames – Returns all the filenames present in this package.

e.g.  /api/package/felix-gogo-command/filenames

  • /api/tar_file/<tar_file>/filenames – Returns all the filenames present in a specific version of the package i.e. the tar_file given by user.

e.g.    /api/tar_file/fedora-release-22.tar.bz2/filenames

  • /api/compare/package/common – Compares two or more packages by their sha256sum values and returns the filenames common to all the packages.

e.g.   /api/compare/package/common \
package==ark \
package==baloo

  • /api/compare/package/difference – Compares two or more packages by their sha256sum values and returns the filenames which are not common to all the packages.

e.g.   /api/compare/package/difference \
package==kamera \
package==fedora-release

  • /api/compare/tar_file/common – If you want to compare two versions of the same package, comparing them by package name won’t work. This endpoint compares two or more tar_files by their sha256sum values and returns the filenames common to all the tar_files.

e.g.    /api/compare/tar_file/common \
tar_file==fedora-release-21.tar.bz2 \
tar_file==fedora-release-22.tar.bz2

  • /api/compare/tar_file/difference – Compares two or more tar_files and returns the filenames that are not common to all the tar_files.

e.g.    /api/compare/tar_file/difference \
tar_file==fedora-release-21.tar.bz2 \
tar_file==fedora-release-22.tar.bz2
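
The `==` arguments in the compare examples above are HTTPie's way of adding query-string parameters, so the same comparison can be requested from any client by repeating the parameter in the URL. Here is a minimal Python sketch of building such a URL, assuming the dev instance address mentioned earlier; the helper name `compare_url` is made up for illustration and is not part of shumgrepper.

```python
from urllib.parse import urlencode


def compare_url(base, kind, mode, names):
    """Build a compare URL such as /api/compare/package/common?package=a&package=b.

    kind is 'package' or 'tar_file'; mode is 'common' or 'difference'.
    """
    # A list of (key, value) pairs lets urlencode repeat the same key.
    query = urlencode([(kind, name) for name in names])
    return "{}/api/compare/{}/{}?{}".format(base.rstrip("/"), kind, mode, query)


url = compare_url(
    "http://209.132.184.120",  # dev instance address, assumed to be reachable
    "tar_file",
    "common",
    ["fedora-release-21.tar.bz2", "fedora-release-22.tar.bz2"],
)
print(url)
```

The resulting URL can then be fetched with curl, wget or `urllib.request.urlopen`.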

Shumgrepper – Summary of work

In this post, I am going to summarize my work so far. As the mid-term evaluation is going on this week, it is a good time to look at the work done so far, the tasks that are still left and what I have planned for the coming days.

As per my proposal, I divided the whole project into 5 major tasks.

1. Query building for database

It involves the following tasks:

  • Set up the summershum database and enabled shumgrepper to query it.
  • Designed the basic layout and defined the directory structure of the app.
  • Created end-points to display file information for a particular sha1sum, sha256sum, md5sum or tarsum.

2. Web – API Wrapper of the app

It involves:

  • JSON API: It returns JSON content if the request asks for JSON, i.e. the request's Accept header is “application/json”.
  • Compare packages: It returns the filenames which differ between the packages being compared.
  • Files of a package: It returns the filenames of a package.

To do:

  • GPL License: find GPL license present in packages.

3. Web front-end

It involves improving the GUI of the app:

  • Created an index bar that appears on top of every page.
  • Added a function to summershum to list the names of all the packages. The /packages endpoint lists all the package names; clicking a name shows that package's information.
  • Added docs for API.

To do:

  • Design the front page of the app. A text box can be added to make /sha1sum, /sha256sum, /tarsum and /md5sum queries simpler.
  • Improve the API docs.
  • Separate the API from the UI.
  • Improve the GUI for displaying filenames for /compare and /packages/{packages}/filenames.
  • Other UI improvements as needed.

4. Deployment

I am currently working on the deployment. I hope it will be completed within the next 2 days.

5. Integration and Testing

To do:

  • Create unit tests for the app.
  • Integration of the app (if time permits).

 

Shumgrepper – Week 5

Last week I worked on improving the GUI of the app, the JSON API for all the end-points, documentation for the API and a few bug fixes.

1. GUI

Here’s the top index bar that will appear on all the pages.

Screenshot from 2014-06-24 14:26:43

 

The Packages button shows the list of packages; clicking a particular package shows its information.

Screenshot from 2014-06-24 10:02:43

 

 

2. JSON API for all end-points.

A query to display the files of a package:

 $ http get http://localhost:5000/package/tito/filenames

It returns all the filenames present in the package.

[
    "/tito-0.5.4/wercker.yml",
    "/tito-0.5.4/titorc.5.asciidoc",
    "/tito-0.5.4/AUTHORS",
    "/tito-0.5.4/.gitignore",
    "/tito-0.5.4/.gitattributes",
    "/tito-0.5.4/test/functional/__init__.py",
    "/tito-0.5.4/test/functional/specs/extsrc.spec",
    "/tito-0.5.4/src/tito/exception.py",
    "/tito-0.5.4/src/tito/distributionbuilder.py",
    "/tito-0.5.3/test/functional/builder_tests.py",
    "/tito-0.5.3/test/functional/build_gitannex_tests.py",
    "/tito-0.5.3/src/tito/common.py",
    "/tito-0.5.3/src/tito/cli.py",
    "/tito-0.5.3/src/tito/buildparser.py",
    "/tito-0.5.3/src/tito/release/main.py",
    "/tito-0.5.3/src/tito/release/copr.py",
    "/tito-0.5.3/src/tito/release/__init__.py",
    "/tito-0.5.3/hacking/titotest-centos-6.4/Dockerfile",
    "/tito-0.5.3/hacking/titotest-centos-5.9/Dockerfile",
    "/tito-0.5.5/src/tito/tagger/zstreamtagger.py",
    "/tito-0.5.5/src/tito/tagger/rheltagger.py",
    "/tito-0.5.5/src/tito/tagger/main.py",
    "/tito-0.5.5/src/tito/tagger/__init__.py",
    "/tito-0.5.5/src/tito/release/obs.py",
    "/tito-0.5.5/src/tito/release/main.py",
    "/tito-0.5.5/src/tito/release/copr.py",
    "/tito-0.5.5/src/tito/release/__init__.py",
    "/tito-0.5.5/rel-eng/custom/custom.py",
    "/tito-0.5.5/hacking/runtests.sh",
    "/tito-0.5.5/bin/tar-fixup-stamp-comment.pl",
    "/tito-0.5.5/bin/generate-patches.pl"
]

Similarly, queries can be made to compare packages and to display information by sha1sum, sha256sum, tarsum and md5sum.

 

3. Added documentation on how to query results via API.

 

Screenshot from 2014-06-24 10:10:40

 

Shumgrepper – Week 4 update

Work done this week can be summarized as below.

1. Json output

Earlier the app returned only HTML content. What if a user requests data in JSON?

A user can request JSON content via the API through an http get request. The Accept header of such a request is ‘*/*’, which we treat as ‘application/json’. Otherwise, if the user makes the request through the user interface, he/she gets the HTML content of the data. For this, we first need to find the mimetype of the request made.

mimetype = flask.request.headers.get('Accept')

I had already done this in the datagrepper project, where I wrote a function request_wants_html which returns true if the best matching mimetype is “text/html”.

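import flask

def request_wants_html():
    # With "Accept: */*" every candidate matches equally well, so the
    # first one listed ('application/json') is picked.
    best = flask.request.accept_mimetypes \
        .best_match(['application/json', 'text/html', 'text/plain'])
    return best == 'text/html'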
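
To see why a ‘*/*’ request ends up as JSON while a browser request ends up as HTML, here is a simplified, standard-library-only sketch of this kind of content negotiation. It is not Werkzeug's implementation (it ignores specificity and other details of the HTTP spec); `parse_accept`, `matches` and `best_match` are hypothetical helpers written only for illustration.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (mimetype, quality) pairs."""
    items = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        mimetype, quality = fields[0].lower(), 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        if mimetype:
            items.append((mimetype, quality))
    return items


def matches(offered, accepted):
    """True if an offered type satisfies a pattern such as text/* or */*."""
    if accepted == "*/*":
        return True
    off_type, _, off_sub = offered.partition("/")
    acc_type, _, acc_sub = accepted.partition("/")
    return off_type == acc_type and acc_sub in ("*", off_sub)


def best_match(header, offered):
    """Return the offered type with the highest quality; earlier offers win ties."""
    best, best_q = None, 0.0
    for candidate in offered:
        for accepted, quality in parse_accept(header):
            if quality > best_q and matches(candidate, accepted):
                best, best_q = candidate, quality
    return best


OFFERED = ["application/json", "text/html", "text/plain"]
BROWSER = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"

api_choice = best_match("*/*", OFFERED)
browser_choice = best_match(BROWSER, OFFERED)
print(api_choice, browser_choice)
```

Because ties go to the earliest offer, listing 'application/json' first is what makes a bare ‘*/*’ request resolve to JSON.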

Then I had to convert the data, which was made of file objects, into JSON. I tried using built-in functions, but they could not convert it to JSON directly. It took a lot of time to get through this problem. Finally I serialized the data by converting each object into a dict.

def JSONEncoder(messages):
    message_list = []

    for message in messages:
        message_dict = dict(
            tar_file = message.tar_file,
            md5sum = message.md5sum,
            sha256sum = message.sha256sum,
            pkg_name = message.pkg_name,
            filename = message.filename,
            tar_sum = message.tar_sum,
            sha1sum = message.sha1sum
        )
        message_list.append(message_dict)

    return message_list
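
Once each row is a plain dict, the standard json module can serialize the whole list. A small sketch of that last step, using a namedtuple as a stand-in for the real database objects; the class `FileRow`, the helper `to_dicts` and the placeholder values are illustrative and not taken from summershum.

```python
import json
from collections import namedtuple

FIELDS = ["tar_file", "md5sum", "sha256sum", "pkg_name",
          "filename", "tar_sum", "sha1sum"]

# Stand-in for a database row object; the real objects come from summershum.
FileRow = namedtuple("FileRow", FIELDS)


def to_dicts(rows):
    """Turn row objects into plain dicts that json.dumps can handle."""
    return [{field: getattr(row, field) for field in FIELDS} for row in rows]


# Placeholder values, not real checksums.
rows = [FileRow("example-1.0.tar.gz", "md5-placeholder", "sha256-placeholder",
                "example", "/example-1.0/README", "tarsum-placeholder",
                "sha1-placeholder")]

payload = json.dumps(to_dicts(rows), indent=4)
print(payload)
```

In a Flask view, a string like `payload` could be returned wrapped in a `flask.Response` with the mimetype set to 'application/json'.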

 

2. Compare packages

This can be used to compare packages and return the filenames that are different (or the same) in two or more packages. I first approached this by creating an endpoint like /compare/packages/{package1}/{package2}. This way, I could only compare two packages. I discussed it with pingou, and he suggested taking all the package names as input from the user and then comparing them. I read a few tutorials and found this of great help. I created a web form using the Flask-WTF extension.

Screenshot from 2014-06-17 17:13:57

It returns the filenames common to all the packages. But today pingou pointed out that a user comparing different versions of a package would be more interested in the files that have changed. Besides this, I have to create an API for it.
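
The comparison itself boils down to set operations on checksums. Here is a rough sketch, assuming each package or tar_file has already been loaded as a mapping from filename to sha256sum; the function names and the sample data are hypothetical, and the real app may organise its queries differently.

```python
def _shared_checksums(archives):
    """Checksums that occur in every archive."""
    checksum_sets = [set(files.values()) for files in archives.values()]
    return set.intersection(*checksum_sets) if checksum_sets else set()


def common_files(archives):
    """Filenames whose sha256sum occurs in every archive.

    archives maps an archive name to a {filename: sha256sum} dict.
    """
    shared = _shared_checksums(archives)
    return sorted(
        name
        for files in archives.values()
        for name, checksum in files.items()
        if checksum in shared
    )


def different_files(archives):
    """Filenames whose sha256sum is missing from at least one archive."""
    shared = _shared_checksums(archives)
    return sorted(
        name
        for files in archives.values()
        for name, checksum in files.items()
        if checksum not in shared
    )


# Made-up sample data; "aaa", "bbb", "ccc" stand in for real sha256sums.
archives = {
    "demo-1.0.tar.gz": {"/demo-1.0/README": "aaa", "/demo-1.0/main.py": "bbb"},
    "demo-1.1.tar.gz": {"/demo-1.1/README": "aaa", "/demo-1.1/main.py": "ccc"},
}
print(common_files(archives))
print(different_files(archives))
```

Matching on checksums rather than on paths matters here, because the same file usually sits under a different top-level directory in each version of a tarball.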

GSOC Shumgrepper: Progress till now

Firstly, I want to apologize for being so late in updating my progress. But it’s better late than never. I am working on Shumgrepper, a web app for summershum, which collects the md5sum, sha1sum and sha256sum of every file present in every package. I am about to complete the 3rd week of my internship, but I am going a little slow and need to gear up in the coming days.

My work can be summarized as below:

Developed the basic framework of the app and defined its directory structure:
shumgrepper\
    shumgrepper\
        templates\                     //store web app templates
        app.py                         //contains definition of various end-points
        default_config.py              
    apache\                            // for installing locations
    summershum\                        //files containing methods used for querying
    fedmsg.d\                          // display fedmsg messages in human readable format
    createddb.py
    development.cfg                     
    requirement.txt                    
    runserver.py                      //to run server
    setup.cfg
Set up the database and made the shumgrepper app query the summershum database.

After some discussion about database models, I finally decided to use an sqlite database, which is what the original database model of summershum uses. I cloned the summershum repository and ran it to populate the database, then enabled shumgrepper to query it.
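
Querying such a database by checksum is a single SELECT. The sketch below uses an in-memory sqlite database so that it can be run on its own; the table name `files`, the helper `files_by_sha1` and the sample row are assumptions for illustration, and summershum's actual schema and query helpers may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name

# Hypothetical table layout, one row per file found in a tarball.
conn.execute(
    "CREATE TABLE files ("
    "pkg_name TEXT, tar_file TEXT, filename TEXT, "
    "sha1sum TEXT, sha256sum TEXT, md5sum TEXT, tar_sum TEXT)"
)
# Placeholder values, not real checksums.
conn.execute(
    "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("demo", "demo-1.0.tar.gz", "/demo-1.0/README",
     "sha1-placeholder", "sha256-placeholder", "md5-placeholder",
     "tarsum-placeholder"),
)


def files_by_sha1(connection, sha1sum):
    """Return every stored file whose sha1sum matches, as a list of dicts."""
    cursor = connection.execute(
        "SELECT * FROM files WHERE sha1sum = ?", (sha1sum,)
    )
    return [dict(row) for row in cursor.fetchall()]


result = files_by_sha1(conn, "sha1-placeholder")
print(result)
```

Passing the checksum as a bound parameter (`?`) instead of formatting it into the SQL string keeps values taken from the URL from being interpreted as SQL.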

Creation of endpoints

The question was whether to create separate endpoints or one endpoint for everything. After discussion with my mentor, we decided to have separate endpoints to make querying easier and cleaner.

Till now, I have created 4 endpoints:

  • /sha1/<sha1sum>             // to query by sha1sum
  • /md5/<md5sum>              // to query by md5sum
  • /tar/<tarsum>              // to query by tarsum
  • /sha256/<sha256sum>            // to query by sha256sum
Added CSS so that the data (file information) appears in the form of a table.

Screenshot from 2014-06-07 00:41:04

What’s next?

  • So far, it returns data in HTML form. If a user requests data in JSON format, it should return JSON output.
  • Add more endpoints to return the specific information requested by the user.
    e.g. if the user wants to see the filenames with a specific sha1sum.
  • Create the front page of the app.

Mediawiki-vagrant is tough to tame on your local machine

I am a newbie in PHP, and my next task is to build a mediawiki extension that will embed datagrepper messages in the wiki user profile. Even setting up the mediawiki environment was a tough job for me. But I have managed to do so, and now I am going to share the installation steps.

How to install Mediawiki-vagrant?

1. Install git
2. Get VirtualBox

First install the dependencies:

$ sudo yum update

$ sudo yum install binutils qt gcc make patch libgomp glibc-headers glibc-devel kernel-headers kernel-devel dkms

Now, install VirtualBox:

$ sudo yum install VirtualBox-4.3

3. Download the latest version of vagrant.

4. Get the code and create your machine:

$ git clone https://gerrit.wikimedia.org/r/mediawiki/vagrant
$ cd vagrant
$ vagrant up
[charul@localhost vagrant]$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
[default] Clearing any previously set forwarded ports...
[default] Clearing any previously set network interfaces...
[default] Preparing network interfaces based on configuration...
[default] Forwarding ports...
[default] -- 22 => 2222 (adapter 1)
[default] -- 80 => 8080 (adapter 1)
[default] Running 'pre-boot' VM customizations...
[default] Booting VM...
[default] Waiting for machine to boot. This may take a few minutes...
[default] Machine booted and ready!
[default] Configuring proxy for Apt...
[default] Configuring proxy environment variables...
[default] Configuring proxy for PEAR...
GuestAdditions 4.3.6 running --- OK.
[default] Setting hostname...
[default] Configuring and enabling network interfaces...
[default] Mounting shared folders...
[default] -- /vagrant
[default] -- /vagrant/logs
[default] -- /tmp/vagrant-puppet-1/manifests
[default] -- /tmp/vagrant-puppet-1/modules-0
[default] VM already provisioned. Run `vagrant provision` or use `--provision` to force it

As my internet connection is behind a proxy, I had to install the vagrant-proxyconf plugin.

$ vagrant plugin install vagrant-proxyconf

Then I configured the proxy by adding the following lines to $HOME/.vagrant.d/Vagrantfile (or to the project Vagrantfile):

Vagrant.configure('2') do |config|
  if Vagrant.has_plugin?("vagrant-proxyconf")
    config.proxy.http     = "http://username:password@proxyhost:port/"
    config.proxy.https    = "http://username:password@proxyhost:port/"
    config.proxy.no_proxy = "localhost,127.0.0.1,.example.com"
  end
  # ... other stuff
end

After this,

$ vagrant provision

The first time, I got this output, which wasn’t correct. :(

[charul@localhost vagrant]$ vagrant provision
[default] Configuring proxy for Apt...
[default] Configuring proxy environment variables...
[default] Configuring proxy for PEAR...
[default] Running provisioner: puppet...
Running Puppet with site.pp...
info: Applying configuration version '1392325515.9dbe0e1'
info: mount[files]: allowing mediawiki-vagrant access
notice: /Stage[main]/Mediawiki/File[/vagrant/settings.d]/mode: mode changed '0775' to '0755'
err: /Stage[main]/Redis::Php/Package[php5-redis]/ensure: change from purged to present failed: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install php5-redis' returned 100: Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package php5-redis
err: /Stage[main]/Mediawiki::Phpsh/Package[phpsh]/ensure: change from absent to 1.3.3 failed: Could not update: Execution of '/usr/bin/pip install -q phpsh==1.3.3' returned 1:   Cannot fetch index base URL http://pypi.python.org/simple/
Could not find any downloads that satisfy the requirement phpsh==1.3.3
No distributions at all found for phpsh==1.3.3
Storing complete log in /root/.pip/pip.log
at /tmp/vagrant-puppet-1/modules-0/mediawiki/manifests/phpsh.pp:14
notice: /Stage[main]/Mediawiki::Phpsh/File[/etc/phpsh/rc.php]: Dependency Package[phpsh] has failures: true
warning: /Stage[main]/Mediawiki::Phpsh/File[/etc/phpsh/rc.php]: Skipping because of failed dependencies
notice: /Stage[main]/Apache/Service[apache2]: Dependency Package[php5-redis] has failures: true
warning: /Stage[main]/Apache/Service[apache2]: Skipping because of failed dependencies
notice: /Stage[main]/Mediawiki/Exec[check settings]/returns: executed successfully
info: /Stage[main]/Mediawiki/Exec[check settings]: Scheduling refresh of Exec[mediawiki setup]
notice: /Stage[main]/Mediawiki/Exec[mediawiki setup]/returns: executed successfully
notice: /Stage[main]/Mediawiki/Exec[mediawiki setup]: Triggered 'refresh' from 1 events
notice: /Stage[main]/Mediawiki/Exec[configure phpunit]/returns: executed successfully
notice: /Stage[main]/Mediawiki/File[/usr/local/bin/run-mediawiki-tests]/ensure: defined content as '{md5}6b154f733a6f77b5728e480b2714fe72'
notice: /Stage[main]/Mediawiki/Exec[require extra settings]/returns: executed successfully
err: /Stage[main]/Git/Package[git-review]/ensure: change from absent to 1.23 failed: Could not update: Execution of '/usr/bin/pip install -q git-review==1.23' returned 1:   Cannot fetch index base URL http://pypi.python.org/simple/
Could not find any downloads that satisfy the requirement git-review==1.23
No distributions at all found for git-review==1.23
Storing complete log in /root/.pip/pip.log
at /tmp/vagrant-puppet-1/modules-0/git/manifests/init.pp:32
notice: Finished catalog run in 69.08 seconds
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

FACTER_fqdn='mediawiki-vagrant.dev' FACTER_forwarded_port='8080' FACTER_shared_apt_cache='/vagrant/apt-cache/' FACTER_provider_name='virtualbox' FACTER_provider_version='4.3.6' puppet apply --templatedir /vagrant/puppet/templates --verbose --config_version /vagrant/puppet/extra/config-version --fileserverconfig /vagrant/puppet/extra/fileserver.conf --logdest /vagrant/logs/puppet/puppet.9dbe0e134.log --logdest console --modulepath '/tmp/vagrant-puppet-1/modules-0' --manifestdir /tmp/vagrant-puppet-1/manifests --detailed-exitcodes /tmp/vagrant-puppet-1/manifests/site.pp || [ $? -eq 2 ]

Stdout from the command:

info: Applying configuration version '1392325515.9dbe0e1'
info: mount[files]: allowing mediawiki-vagrant access
notice: /Stage[main]/Mediawiki/File[/vagrant/settings.d]/mode: mode changed '0775' to '0755'
err: /Stage[main]/Redis::Php/Package[php5-redis]/ensure: change from purged to present failed: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install php5-redis' returned 100: Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package php5-redis

err: /Stage[main]/Mediawiki::Phpsh/Package[phpsh]/ensure: change from absent to 1.3.3 failed: Could not update: Execution of '/usr/bin/pip install -q phpsh==1.3.3' returned 1:   Cannot fetch index base URL http://pypi.python.org/simple/
Could not find any downloads that satisfy the requirement phpsh==1.3.3
No distributions at all found for phpsh==1.3.3
Storing complete log in /root/.pip/pip.log
at /tmp/vagrant-puppet-1/modules-0/mediawiki/manifests/phpsh.pp:14
notice: /Stage[main]/Mediawiki::Phpsh/File[/etc/phpsh/rc.php]: Dependency Package[phpsh] has failures: true
warning: /Stage[main]/Mediawiki::Phpsh/File[/etc/phpsh/rc.php]: Skipping because of failed dependencies
notice: /Stage[main]/Apache/Service[apache2]: Dependency Package[php5-redis] has failures: true
warning: /Stage[main]/Apache/Service[apache2]: Skipping because of failed dependencies
notice: /Stage[main]/Mediawiki/Exec[check settings]/returns: executed successfully
info: /Stage[main]/Mediawiki/Exec[check settings]: Scheduling refresh of Exec[mediawiki setup]
notice: /Stage[main]/Mediawiki/Exec[mediawiki setup]/returns: executed successfully
notice: /Stage[main]/Mediawiki/Exec[mediawiki setup]: Triggered 'refresh' from 1 events
notice: /Stage[main]/Mediawiki/Exec[configure phpunit]/returns: executed successfully
notice: /Stage[main]/Mediawiki/File[/usr/local/bin/run-mediawiki-tests]/ensure: defined content as '{md5}6b154f733a6f77b5728e480b2714fe72'
notice: /Stage[main]/Mediawiki/Exec[require extra settings]/returns: executed successfully
err: /Stage[main]/Git/Package[git-review]/ensure: change from absent to 1.23 failed: Could not update: Execution of '/usr/bin/pip install -q git-review==1.23' returned 1:   Cannot fetch index base URL http://pypi.python.org/simple/
Could not find any downloads that satisfy the requirement git-review==1.23
No distributions at all found for git-review==1.23
Storing complete log in /root/.pip/pip.log
at /tmp/vagrant-puppet-1/modules-0/git/manifests/init.pp:32
notice: Finished catalog run in 69.08 seconds

Stderr from the command:

This was when I got really frustrated. Even after repeated attempts I couldn’t succeed. This was mainly because of the proxy, and also because many websites are blocked in my college. It failed to download a few dependencies like phpsh, php5-redis etc. I installed them manually, but the ‘vagrant provision’ command gave me the same error every time. So I was left with one option, i.e. to install them without a proxy connection. I did that and the installation succeeded. This gave me momentary relief, but the real problems were yet to arrive. Well, I will discuss them in the next post. For now I am feeling accomplished after installing this.

So, the correct output that should be displayed on the terminal after ‘vagrant provision’ is:

[charul@localhost vagrant]$ vagrant provision
[default] Running provisioner: puppet...
Running Puppet with site.pp...
info: Applying configuration version '1392886401.15b24e4'
info: mount[files]: allowing mediawiki-vagrant access
notice: /Stage[main]/Mediawiki/File[/vagrant/settings.d]/mode: mode changed '0775' to '0755'
notice: Finished catalog run in 5.91 seconds

Now, to open the mediawiki main page, go to http://127.0.0.1:8080/wiki/Main_Page.

Screenshot from 2014-02-20 14:34:36

To log in to the mediawiki-vagrant machine from the terminal:

 $ vagrant ssh

Screenshot from 2014-02-14 02:47:23

I think this looks cool :)