Monthly Archives: June 2014

Shumgrepper – Summary of work

In this post, I summarize my work so far. Since the mid-term evaluation is this week, it is a good time to look at the work done, the tasks that are still left, and what I plan to do in the coming days.

As per my proposal, I divided the whole project into 5 major tasks.

1. Query building for database

It involves the following tasks:

  • Set up the summershum database and enabled shumgrepper to query it.
  • Designed the basic layout and defined the directory structure of the app.
  • Created endpoints to display file information for a particular sha1sum, sha256sum, md5sum or tarsum.

2. Web – API Wrapper of the app

It involves:

  • JSON API: returns JSON content if the request asks for JSON, i.e. its Accept header is “application/json”.
  • Compare packages: returns the filenames that differ between the packages being compared.
  • Files of a package: returns the filenames of a package.

To do:

  • GPL license: find the GPL licenses present in packages.

3. Web front-end

It involves improving the GUI of the app:

  • Created an index bar that appears at the top of every page.
  • Added a function to summershum to list the names of all the packages. The /packages endpoint lists all the package names; clicking a name shows that package's information.
  • Added docs for the API.

To do:

  • Design the front page of the app. A text box can be added to make /sha1sum, /sha256sum, /tarsum and /md5sum queries simpler.
  • Improve the API docs.
  • Separate the API from the UI.
  • Improve the GUI that displays filenames for /compare and /packages/{packages}/filenames.
  • Make other UI improvements as needed.

4. Deployment

I am currently working on the deployment. I hope to complete it within the next 2 days.

5. Integration and Testing

To do:

  • Create unit tests for the app.
  • Integration of the app (if time permits).

 


Shumgrepper – Week 5

Last week I worked on improving the GUI of the app, the JSON API for all the endpoints, the documentation for the API and a few bug fixes.

1. GUI

Here’s the top index bar that appears on every page.

Screenshot from 2014-06-24 14:26:43

 

The Packages button leads to a list of packages. Clicking a particular package shows its information.

Screenshot from 2014-06-24 10:02:43

 

 

2. JSON API for all endpoints

A query to display the files of a package:

 $ http get http://localhost:5000/package/tito/filenames

It returns all the file names present in the package.

[
    "/tito-0.5.4/wercker.yml",
    "/tito-0.5.4/titorc.5.asciidoc",
    "/tito-0.5.4/AUTHORS",
    "/tito-0.5.4/.gitignore",
    "/tito-0.5.4/.gitattributes",
    "/tito-0.5.4/test/functional/__init__.py",
    "/tito-0.5.4/test/functional/specs/extsrc.spec",
    "/tito-0.5.4/src/tito/exception.py",
    "/tito-0.5.4/src/tito/distributionbuilder.py",
    "/tito-0.5.3/test/functional/builder_tests.py",
    "/tito-0.5.3/test/functional/build_gitannex_tests.py",
    "/tito-0.5.3/src/tito/common.py",
    "/tito-0.5.3/src/tito/cli.py",
    "/tito-0.5.3/src/tito/buildparser.py",
    "/tito-0.5.3/src/tito/release/main.py",
    "/tito-0.5.3/src/tito/release/copr.py",
    "/tito-0.5.3/src/tito/release/__init__.py",
    "/tito-0.5.3/hacking/titotest-centos-6.4/Dockerfile",
    "/tito-0.5.3/hacking/titotest-centos-5.9/Dockerfile",
    "/tito-0.5.5/src/tito/tagger/zstreamtagger.py",
    "/tito-0.5.5/src/tito/tagger/rheltagger.py",
    "/tito-0.5.5/src/tito/tagger/main.py",
    "/tito-0.5.5/src/tito/tagger/__init__.py",
    "/tito-0.5.5/src/tito/release/obs.py",
    "/tito-0.5.5/src/tito/release/main.py",
    "/tito-0.5.5/src/tito/release/copr.py",
    "/tito-0.5.5/src/tito/release/__init__.py",
    "/tito-0.5.5/rel-eng/custom/custom.py",
    "/tito-0.5.5/hacking/runtests.sh",
    "/tito-0.5.5/bin/tar-fixup-stamp-comment.pl",
    "/tito-0.5.5/bin/generate-patches.pl"
]

Similarly, queries can be made to compare packages and to display information by sha1sum, sha256sum, tarsum and md5sum.
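The same request can also be made from Python. The sketch below is only an illustration using the standard library: the helper names build_request and fetch_filenames are made up for this example, and it assumes the development server from the command above is running on localhost:5000.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the development server from the example above.
BASE_URL = "http://localhost:5000"


def build_request(package):
    """Build a GET request for a package's filenames, asking for JSON."""
    url = "{}/package/{}/filenames".format(
        BASE_URL, urllib.parse.quote(package, safe=""))
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def fetch_filenames(package):
    """Send the request and decode the JSON list of filenames."""
    with urllib.request.urlopen(build_request(package)) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("tito")
print(request.full_url)
```

Calling fetch_filenames("tito") needs the server to be up; build_request on its own does not touch the network.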

 

3. Added documentation on how to query results via API.

 

Screenshot from 2014-06-24 10:10:40

 

Shumgrepper – Week 4 update

Work done this week can be summarized as below.

1. JSON output

Earlier, the app returned only HTML content. What if a user requests the data in JSON?

A user can request JSON content from the API with an HTTP GET request. The Accept header of such a request is ‘*/*’, which can be treated as ‘application/json’. If the user instead makes the request through the user interface, he/she gets the HTML content of the data. To tell the two apart, we first need to find the mimetype of the request made.

mimetype = flask.request.headers.get('Accept')

I had already done this in the datagrepper project, where I wrote a function request_wants_html which returns True if the best-matching mimetype is “text/html”.

import flask

def request_wants_html():
    # 'application/json' is listed first, so a bare '*/*' resolves to JSON.
    best = flask.request.accept_mimetypes \
        .best_match(['application/json', 'text/html', 'text/plain'])
    return best == 'text/html'
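To see why a request with ‘*/*’ ends up as JSON while a browser request ends up as HTML, here is a small standard-library sketch of the matching idea. It is a simplified illustration, not Werkzeug's actual implementation: the functions parse_accept, matches and best_match are written for this example, and malformed q values are not handled.

```python
OFFERED = ["application/json", "text/html", "text/plain"]


def parse_accept(header):
    """Parse an Accept header into (pattern, quality) pairs."""
    items = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        if not fields[0]:
            continue
        quality = 1.0  # the default when no q= parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                quality = float(field[2:])
        items.append((fields[0].lower(), quality))
    return items


def matches(offered, pattern):
    """True if an offered type satisfies a client pattern such as text/*."""
    if pattern == "*/*":
        return True
    off_type, _, off_sub = offered.partition("/")
    pat_type, _, pat_sub = pattern.partition("/")
    return off_type == pat_type and pat_sub in ("*", off_sub)


def best_match(header, offered):
    """Return the offered type the client rates highest.

    Ties go to the type listed first in `offered`.
    """
    accepted = parse_accept(header)
    best, best_quality = None, 0.0
    for mimetype in offered:
        quality = max(
            (q for pattern, q in accepted if matches(mimetype, pattern)),
            default=0.0,
        )
        if quality > best_quality:
            best, best_quality = mimetype, quality
    return best


def wants_html(header):
    return best_match(header, OFFERED) == "text/html"


browser = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
print(wants_html("*/*"))    # API client: every type ties, JSON is listed first
print(wants_html(browser))  # browser: text/html has the highest quality
```

The order of the offered list matters: with ‘*/*’ every type gets the same quality, so the first entry wins.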

Then I had to convert the data, which was a file object, into JSON. I tried the built-in functions, but they could not convert it into JSON directly. It took a lot of time to get past this problem. Finally, I serialized the data by converting each object into a dict.

def JSONEncoder(messages):
    # Turn each file object into a plain dict that can be dumped as JSON.
    message_list = []

    for message in messages:
        message_dict = dict(
            tar_file=message.tar_file,
            md5sum=message.md5sum,
            sha256sum=message.sha256sum,
            pkg_name=message.pkg_name,
            filename=message.filename,
            tar_sum=message.tar_sum,
            sha1sum=message.sha1sum,
        )
        message_list.append(message_dict)

    return message_list
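Once the objects are plain dicts, the standard json module can dump them. Below is a minimal, self-contained sketch of that last step. The File namedtuple is only a stand-in for the real database objects, the to_dicts helper is written for this example, and the checksum values are placeholders, not real data.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for a file object returned by the database query.
File = namedtuple(
    "File", "tar_file md5sum sha256sum pkg_name filename tar_sum sha1sum")


def to_dicts(rows):
    """Convert row objects into plain dicts, one key per attribute."""
    return [{field: getattr(row, field) for field in File._fields}
            for row in rows]


rows = [
    File(
        tar_file="demo-1.0.tar.gz",   # placeholder values throughout
        md5sum="<md5sum>",
        sha256sum="<sha256sum>",
        pkg_name="demo",
        filename="/demo-1.0/README",
        tar_sum="<tarsum>",
        sha1sum="<sha1sum>",
    ),
]

body = json.dumps(to_dicts(rows), indent=4, sort_keys=True)
print(body)
```

In a Flask view, a string like body could then be returned in a response with the mimetype set to application/json.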

 

2. Compare packages

This can be used to compare packages and return the filenames that differ (or are the same) in two or more packages. I first approached this by creating an endpoint like /compare/packages/{package1}/{package2}. This way, I could only compare two packages. I discussed it with pingou, who suggested taking all the package names as input from the user and then comparing them. I read a few tutorials and found this of great help. I created a web form using the Flask-WTF extension.

Screenshot from 2014-06-17 17:13:57

It returns the filenames common to all the packages. But today, pingou pointed out that a user comparing different versions of a package would be more interested in the files that have changed. Besides this, I have to create an API for it.
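One possible way to report changed files between two versions is to line up the paths and compare their checksums. The sketch below is only an assumption about how this could be done, not the code in shumgrepper: the functions strip_top_dir and compare are made up for this example, and the package "demo" with its checksums is placeholder data.

```python
def strip_top_dir(path):
    """Drop the leading '/<name>-<version>/' so versions can be lined up."""
    parts = path.lstrip("/").split("/", 1)
    return parts[1] if len(parts) == 2 else parts[0]


def compare(old, new):
    """Compare two {full path: sha1sum} mappings of one package."""
    old_rel = {strip_top_dir(path): shasum for path, shasum in old.items()}
    new_rel = {strip_top_dir(path): shasum for path, shasum in new.items()}
    common = old_rel.keys() & new_rel.keys()
    return {
        "added": sorted(new_rel.keys() - old_rel.keys()),
        "removed": sorted(old_rel.keys() - new_rel.keys()),
        "changed": sorted(p for p in common if old_rel[p] != new_rel[p]),
        "unchanged": sorted(p for p in common if old_rel[p] == new_rel[p]),
    }


# Placeholder data for a hypothetical package, not real checksums.
old_version = {
    "/demo-1.0/src/cli.py": "aaa",
    "/demo-1.0/src/common.py": "bbb",
    "/demo-1.0/OLD_NOTES": "ccc",
}
new_version = {
    "/demo-1.1/src/cli.py": "aaa",
    "/demo-1.1/src/common.py": "ddd",
    "/demo-1.1/CHANGELOG": "eee",
}

for kind, paths in compare(old_version, new_version).items():
    print(kind, paths)
```

Stripping the versioned top-level directory is what lets /demo-1.0/src/cli.py and /demo-1.1/src/cli.py count as the same file.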

GSOC Shumgrepper: Progress till now

Firstly, I want to apologize for being so late in updating my progress. But it's better late than never. I am working on Shumgrepper, a web app for summershum, which collects the md5sum, sha1sum and sha256sum of every file present in every package. I am about to complete the 3rd week of my internship, but I am going a little slow and need to gear up in the coming days.

My work can be summarized as below:

Developed the basic framework of the app. Defined the directory structure of the app.

shumgrepper\
    shumgrepper\
        templates\                     // stores web app templates
        app.py                         // contains definitions of the various endpoints
        default_config.py
    apache\                            // for installing locations
    summershum\                        // files containing methods used for querying
    fedmsg.d\                          // displays fedmsg messages in human-readable format
    createddb.py
    development.cfg
    requirement.txt
    runserver.py                       // runs the server
    setup.cfg

Set up the database and made the shumgrepper app query the summershum database.

After a discussion about database models, I finally decided to use the SQLite database, which is what the original summershum database model uses. I cloned the summershum repository and ran it to populate the database so that shumgrepper can query it.

Creation of endpoints

The question was whether to create separate endpoints or one endpoint for all. After a discussion with my mentor, we decided to have separate endpoints to make querying easier and cleaner.

So far, I have created 4 endpoints:

  • /sha1/<sha1sum>             // to query by sha1sum
  • /md5/<md5sum>              // to query by md5sum
  • /tar/<tarsum>              // to query by tarsum
  • /sha256/<sha256sum>            // to query by sha256sum
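Behind each of these endpoints sits the same kind of lookup, just on a different column. Here is a rough, self-contained sketch using Python's sqlite3 module. The table name files, the function files_by_checksum and the sample row are assumptions made for this example, and the real app queries through summershum's own model rather than raw SQL.

```python
import sqlite3

# Maps the endpoint prefix to the (assumed) column holding that checksum.
CHECKSUM_COLUMNS = {
    "sha1": "sha1sum",
    "md5": "md5sum",
    "tar": "tar_sum",
    "sha256": "sha256sum",
}


def files_by_checksum(conn, kind, value):
    """Return (pkg_name, filename, tar_file) rows matching a checksum."""
    column = CHECKSUM_COLUMNS[kind]  # unknown kinds raise KeyError
    query = ("SELECT pkg_name, filename, tar_file FROM files "
             "WHERE {} = ?".format(column))
    return conn.execute(query, (value,)).fetchall()


# In-memory database with one placeholder row, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files (pkg_name TEXT, filename TEXT, tar_file TEXT, "
    "tar_sum TEXT, md5sum TEXT, sha1sum TEXT, sha256sum TEXT)")
conn.execute(
    "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("demo", "/demo-1.0/README", "demo-1.0.tar.gz",
     "tar-placeholder", "md5-placeholder", "sha1-placeholder",
     "sha256-placeholder"))

print(files_by_checksum(conn, "sha1", "sha1-placeholder"))
```

The column name comes from a fixed dictionary and only the checksum value is passed as a query parameter, so user input never ends up in the SQL text.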

Added CSS so that the data (file information) appears in the form of a table.

Screenshot from 2014-06-07 00:41:04

What’s next?

  • So far, the app returns data as HTML. If a user requests data in JSON format, it should return JSON output.
  • Add more endpoints to return the specific information requested by the user,
    e.g. if the user wants to see the file names with a specific sha1sum.
  • Create the front page of the app.