Managing and Building Version-Controlled Maven Repos Using Git, Gradle, and Nexus Server

I currently work in a VERY OLD code base that uses a “thirdparty” directory as a version-controlled library directory, dating from the era before Maven. Some colleagues decided it was time to adopt Maven for building new components, and that was exciting… However, you can imagine the result: a build system composed of “dino” ANT scripts managing the really old stuff, alongside newly introduced Maven pom.xml files.

I personally chose Gradle for the projects I had started, and integrated it to publish the generated jars to the “thirdparty” directory…

http://stackoverflow.com/questions/7826652/how-to-upload-an-existing-collection-of-3rd-party-jars-to-a-maven-server-in-grad

Gradle for Maven Dependencies

The most up-to-date version of this script is shown below. Note that this approach differs from the one in the Stack Overflow question in that this version DOES NOT generate the Maven metadata files (pom.xml, *sha…); it simply copies the generated versioned jars into the “thirdparty” directory (an scm-cached directory).
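
A minimal sketch of such a copy task, assuming the java plugin is applied (the task name matches the “installThirdparty” task mentioned later; the path is illustrative):

task installThirdparty(type: Copy, dependsOn: jar) {
    // copy only the versioned jar; no pom.xml or checksum files are produced
    from jar.archivePath
    into "$rootDir/thirdparty"   // the scm-cached library directory
}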

Then I used Gradle’s dependency mechanism to consume the same “thirdparty” directory as a dependency source…

http://gradle.1045684.n5.nabble.com/Javac-Java-Annotation-Processor-Maven-classpath-dependencies-in-Gradle-tp4626751p4633029.html
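
A minimal sketch of that wiring, using a flat-directory repository (the artifact name is hypothetical):

repositories {
    // treat the scm-cached "thirdparty" directory as a flat jar repository
    flatDir dirs: "$rootDir/thirdparty"
}

dependencies {
    // flatDir resolves by file name (e.g. maceio-finance-1.0.jar), no pom required
    compile name: 'maceio-finance', version: '1.0'
}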

This week I received a surprising email from a developer complaining about not being able to build one of the components. The problem was simply that the Maven repository server had been COMPLETELY REBUILT and all the dependencies were wiped out. As a result, developers maintaining different projects needed to upload their artifacts again. That triggered the question of where the Maven artifacts should be stored. Since the old approach of using the “thirdparty” directory might become a hybrid approach in the future, I thought I could use Git to store the Maven artifact versions in a version-controlled directory of my projects, after I stumbled upon the following blog:

http://cemerick.com/2010/08/24/hosting-maven-repos-on-github/

I agree with the pros/cons of that approach, and it happens to be very similar to the situation I faced today. So I decided to create something similar using a private Git repository on Bitbucket to simulate my environment.

Reusable Gradle Properties

First, the initial Maven configuration is managed by the following:
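
A minimal sketch of what such a mavenProperties.gradle could contain; apart from “release” and “projectMvn”, which are mentioned in this post, all names and values are assumptions:

// mavenProperties.gradle
ext.isRelease = project.hasProperty('release')
ext.projectMvn = [
    group     : 'com.example.maceio',                    // hypothetical group id
    version   : isRelease ? '1.0.0' : '1.0.0-SNAPSHOT',  // -Prelease drops -SNAPSHOT
    repoDir   : "$rootDir/repository",                   // version-controlled Maven repo
    thirdparty: "$rootDir/thirdparty"                    // scm-cached jar directory
]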

This gives me the following capabilities:

  • Using the property “-Prelease” as a switch to drop the “-SNAPSHOT” suffix from the generated jars.
  • Lots of properties saved in the project, making it easier to write scripts that depend on those properties from other build scripts.
  • It could also provide an option to generate jars in the “thirdparty” directory WITHOUT Maven’s metadata files.

For instance, given a project “Maceio Finance” in a repository on BitBucket, I created the following build.gradle file to build the project, generate the jars, declare dependencies on the local scm repository, and upload new versions to it.
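
A hedged sketch of what such a build.gradle could look like, reusing the assumed properties from the sketch above (paths and names are illustrative):

apply plugin: 'java'
apply plugin: 'maven'
apply from: 'mavenProperties.gradle'

group = projectMvn.group
version = projectMvn.version

repositories {
    mavenCentral()
    // the version-controlled Maven repo directory inside the project's git clone
    maven { url projectMvn.repoDir }
}

uploadArchives {
    repositories.mavenDeployer {
        // deploying to a file:// URL writes the jars plus the Maven metadata
        // (pom.xml, checksums) into the git-controlled directory
        repository url: "file://${new File(projectMvn.repoDir).absolutePath}"
    }
}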

The simplicity of Gradle uploadArchives

Running the task “uploadArchives” with the different switches results in the same behavior described by the blog post at http://cemerick.com/2010/08/24/hosting-maven-repos-on-github/, as shown in the output below.

The generated directories with the Maven metadata files are shown below:

Now, if you just want the same jars copied to a given version-controlled directory WITHOUT the metadata files, you can use the “installThirdparty” task described above.

Finally, the other piece needed is a Maven server; we use the open-source Nexus server. Here’s an example of uploading the same contents using that configuration. Note that it reuses the property definitions from mavenProperties.gradle.
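
A hedged sketch of an uploadArchives block pointing at a Nexus instance; the URLs and the credential properties nexusUser/nexusPassword are hypothetical and would normally live in gradle.properties:

uploadArchives {
    repositories.mavenDeployer {
        repository(url: "http://nexus.example.com/content/repositories/releases") {
            authentication(userName: nexusUser, password: nexusPassword)
        }
        snapshotRepository(url: "http://nexus.example.com/content/repositories/snapshots") {
            authentication(userName: nexusUser, password: nexusPassword)
        }
    }
}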

Gotta love Gradle properties

Just as a reference, the output of the command “gradle properties” is a good way to see ALL the project’s variables at runtime. Here’s the output of that command; take a look at the property “projectMvn” in particular.

Cheers!

Running EMMA Code / Test Coverage with JAVA 7 and Gradle 1.0M9

I hit a huge roadblock while trying to integrate EMMA Code Coverage with the latest version of Gradle today… I found this blog post that helped me start toward a solution…

http://www.breskeby.com/2010/04/add-emma-code-coverage-reporting-to-your-gradle-build/

It has been 2 years since that post was last updated (although some users have commented with other solutions), and as the Gradle DSL and Java are still evolving, that version did not work. One common problem is related to Java 7’s new bytecode verifier. EMMA, Cobertura and others face the same problems discussed at http://stackoverflow.com/questions/7010665/testng-emma-cobertura-coverage-and-jdk-7-result-in-classformaterror-and-verif. Just adding the JVM argument “-XX:-UseSplitVerifier” solved the problem for me with EMMA.
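
In isolation, the fix is a one-liner on the test task; the full EMMA configuration follows below:

test {
    // fall back to the old bytecode verifier so that EMMA-instrumented
    // classes compiled for Java 7 pass class verification
    jvmArgs '-XX:-UseSplitVerifier'
}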

Well, after spending a few hours trying to get the previous patch working, I got a working solution running test coverage with EMMA on the latest Gradle with Java 7.

First, add the emma configuration and its dependencies.

configurations{
  emma
}

dependencies {
  // EMMA Code Coverage
  emma "emma:emma:2.1.5320"
  emma "emma:emma_ant:2.1.5320"
  ...
  testCompile group: 'junit', name: 'junit', version: '4.9'
}

Then, update the test task by adding the doFirst{} and doLast{} closures below.

test {
    // add EMMA related JVM args to our tests
    jvmArgs "-XX:-UseSplitVerifier", "-Demma.coverage.out.file=$buildDir/tmp/emma/metadata.emma", "-Demma.coverage.out.merge=true"

    doFirst {
       println "Instrumenting the classes at " + sourceSets.main.output.classesDir.absolutePath
       // define the custom EMMA ant tasks
       ant.taskdef( resource:"emma_ant.properties", classpath: configurations.emma.asPath)

       ant.path(id:"run.classpath") {
          pathelement(location:sourceSets.main.output.classesDir.absolutePath)
       }
       def emmaInstDir = new File(sourceSets.main.output.classesDir.parentFile.parentFile, "tmp/emma/instr")
       emmaInstDir.mkdirs()
       println "Creating $emmaInstDir to instrument from " +       sourceSets.main.output.classesDir.absolutePath
       // instruct our compiled classes and store them at $buildDir/tmp/emma/instr
       ant.emma(enabled: 'true', verbosity:'info'){
          instr(merge:"true", destdir: emmaInstDir.absolutePath, instrpathref:"run.classpath",
                metadatafile: new File(emmaInstDir, '/metadata.emma').absolutePath) {
             instrpath {
             fileset(dir:sourceSets.main.output.classesDir.absolutePath, includes:"**/*.class")
             }
          }
       }
       setClasspath(files("$buildDir/tmp/emma/instr") + configurations.emma + getClasspath())
    }

    // The report should be generated directly after the tests are done.
    // We create three types (txt, html, xml) of reports here. Running your build script now should
    // result in output like that:
    doLast {
       def srcDir = sourceSets.main.java.srcDirs.toArray()[0]
       println "Creating test coverage reports for classes " + srcDir
       def emmaInstDir = new File(sourceSets.main.output.classesDir.parentFile.parentFile, "tmp/emma")
       ant.emma(enabled:"true"){
          new File("$buildDir/reports/emma").mkdirs()
          report(sourcepath: srcDir){
             fileset(dir: emmaInstDir.absolutePath){
                include(name:"**/*.emma")
             }
             txt(outfile:"$buildDir/reports/emma/coverage.txt")
             html(outfile:"$buildDir/reports/emma/coverage.html")
             xml(outfile:"$buildDir/reports/emma/coverage.xml")
          }
       }
       println "Test coverage reports available at $buildDir/reports/emma."
       println "txt: $buildDir/reports/emma/coverage.txt"
       println "Test $buildDir/reports/emma/coverage.html"
       println "Test $buildDir/reports/emma/coverage.xml"
    }
}

You can run the updated build from Gradle as follows:

marcello@hawaii:/u1/development/workspaces/open-source/interviews/vmware$ gradle test
:compileJava
:processResources UP-TO-DATE
:classes
:compileTestJava
:processTestResources UP-TO-DATE
:testClasses
:test
Instrumenting the classes at /u1/development/workspaces/open-source/interviews/vmware/build/classes/main
Creating /u1/development/workspaces/open-source/interviews/vmware/build/tmp/emma/instr to instrument from /u1/development/workspaces/open-source/interviews/vmware/build/classes/main
Creating test coverage reports for classes /u1/development/workspaces/open-source/interviews/vmware/src/main/java
Test coverage reports available at /u1/development/workspaces/open-source/interviews/vmware/build/reports/emma.
txt: /u1/development/workspaces/open-source/interviews/vmware/build/reports/emma/coverage.txt
html: /u1/development/workspaces/open-source/interviews/vmware/build/reports/emma/coverage.html
xml: /u1/development/workspaces/open-source/interviews/vmware/build/reports/emma/coverage.xml

BUILD SUCCESSFUL

Building a Developer’s Social Network…

I’ve been interested in blogging for a long time, and I created this blog to publish my personal stuff… I spent 2 years of my undergraduate Computer Science studies (2000-2002) developing Graw, a research tool to support a collaborative learning environment for scholars. This post was triggered when I started thinking about how I could take advantage of Twitter and/or Facebook to reach out to other developers. In addition, I was personally looking for ways to organize my Computer Science life, from school work to open-source contributions I developed in the past and others I would like to work on.

Today I see a developer’s social network as his/her image and presence in a given community of interest. For instance, BlackBeltFactory, the old JavaBlackBelt, is a community for Java developers looking for certification-level training in Java skills, among others. Similarly, the developers at GitHub and Ohloh are focused on bringing developers together from a social coding perspective, open-source contributions, etc. Finally, I see LinkedIn as the professional face of a developer/engineer/researcher/scientist, as it provides groups of interest, etc. As I read more and more about this subject, I started seeing this as a way to help a developer use the power of the Internet to grow his/her social network among the professionals of a given community.

Tools

Here is a list of tools I found to help developers build their social network using their own website. WordPress is great for starting your own blog; you can download an open-source version of WordPress and maintain it yourself. Other tools that can help a developer spread his/her news are:

  • Blogging: Blogging is by far the simplest way to start. Start by writing solutions to common problems, or a solution to a problem you haven’t found online. When you have more time, do a “Code Garage Sale” and post code snippets from previous projects, say from grad school or another open-source project. These can serve as a future reference to yourself and to others who face, or will face, the same problem. Blogging saves developers from writing long emails to answer recurring questions from other users, in addition to helping them develop their writing skills. Consider not only writing blog posts, but also commenting on the ones you are interested in. You can later learn about your audience by using Google Analytics and FeedBurner for statistical analysis.
  • Social Bookmarks: delicious.com gives developers an account to add bookmarks based on tags. While working on a project, I tend to add links that solve a given problem or reference something I have done. Other users can take advantage of what you have tagged and see the links you used to solve a given problem. I think this can be used to find other developers based on the links you share, while other developers or peers on a project get a view of the references you’ve used. Just think about integrating a new team member into a project and sharing all the research references. For scientific papers, I use MyACM for research papers on ACM, and I constantly bookmark references for a given paper I’m writing.
  • Open-source Identities: for those who contribute to open-source projects, you have probably heard about Ohloh.net. It maintains a cache of popular open-source repositories and helps users identify themselves based on commit messages. This is a great tool to find out who the developers behind open-source projects are, based on the commits to the repository. GitHub and Google Code provide a feature that allows you to link to a specific line of code, allowing developers to give more specific and direct examples. Why not subscribe to someone’s commits?
  • Question-Answer Communities: what do developers do when they need to fix a given problem? What if you need an answer that is specific to the problem’s domain of knowledge? Yes, we can google for specific keywords, but it is more effective to make a presence at StackOverflow. A blog can be seen as a developer’s point of view about a solution to a problem, with the author in control of the comments, etc. On the other hand, others may prefer Wiki pages, whose content is managed by a community of users. StackOverflow’s model revolves around community-supported answers sorted by correctness and votes, where the “most optimum answer” is displayed on top; it uses Digg’s approach of voting a given answer “up” or “down”. This type of community can help build one’s reputation, and recognition is a basic human desire. Have you heard of “Jon Skeet”?

  • Mobile Support: put a developer on a train ride and a mobile device will be his/her best friend. I’ve been using my Android phone more often to scan QR codes for a link to a website, a Google Map, contact information, etc. on a webpage. I usually use the Google Chart API to generate my QR images for a given purpose. Use your barcode reader and you will automatically open this post’s webpage for future reading.
  • Broadcast Yourself: use Twitter to let others see what you are working on or to connect with a given community. This is a very useful way to reach a great number of developers at the same time. As we see today, blogs let us tweet blog posts, and that’s just the beginning.

  • Syndicate your blog: Allow RSS Feeds so that people can subscribe to a given developer’s blog.
  • How can social networking sites make you a better developer?

Some more information on social networking for developers in the links below.

http://channel9.msdn.com/blogs/glucose/hanselminutes-on-9-social-networking-for-developers-part-2-make-your-blog-suck-less
http://channel9.msdn.com/Blogs/Glucose/Hanselminutes-on-9-Social-Networking-for-Developers-Part-1-Every-Developer-Needs-a-Blog

Categories: Open-source

Writing Functional Tests on Groovy on Grails: Experiences from CollabNet Subversion Edge.

October 30, 2010

I first wrote this technical document for the open-source project CollabNet Subversion Edge, on how to design and implement functional tests using Groovy on Grails. This documentation can also be reached at “https://ctf.open.collab.net/sf/wiki/do/viewPage/projects.svnedge/wiki/FunctionalTests”.

Introduction and Setup

The CollabNet Subversion Edge functional tests are based on the Grails “Functional Tests” plugin. But before you get started with them, make sure you have covered the following required steps:

Besides the unit and integration tests used during development, the source code already contains the functional tests plugin support and some test classes, as shown in the Eclipse “Project Explorer” view. In the file system, the files are located at CSVN_DEV/test/functional, where CSVN_DEV is the directory where you previously checked out the source code. That set of test cases is the last one run on our internal Continuous Integration server (Hudson), and it’s usually a good place to find bugs related to user-facing features during development.

svnedge-functional-tests-view-eclipse.png

Functional Tests Basics

This section covers the basics of the functional tests infrastructure on Subversion Edge and assumes you are already familiar with the Grails Functional Tests plugin documentation. The plugin is already installed in the Subversion Edge development project, so you can use the usual commands to run a functional test and visualize the test results and reports. The test cases run as RESTful calls to the controllers defined by the application, but they can also use a URL. For instance:

After the execution of an HTTP method wrapper such as “get()” or “post()”, any test case has access to the response object “this.response” with the HTML payload. Grails uses this object to execute any of the documented “assert*” methods.
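
For illustration, a hypothetical test method using that flow (the URL and the expected string are assumptions):

void testStatusPage() {
    get('/status/index')                     // RESTful call to the status controller
    assertStatus 200
    assertContentContains 'Subversion Edge'  // any string expected in the HTML payload
}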

Another important piece of configuration is CSVN_DEV/grails-app/conf/Config.groovy. Although Grails uses the closure “environment.test”, SvnEdge uses the general closure “svnedge” during the development and test phases, and therefore values from that closure are accessible from the test classes.
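
For example, a test can read values under that closure through the “config” helper described later; the ctfMaster keys used here appear in the configuration shown near the end of this document:

def ctfHost = config.svnedge.ctfMaster.domainName
def ctfPort = config.svnedge.ctfMaster.port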

Functional tests infrastructure

Assuming you have your development infrastructure set up, you will find the current implementation of the functional tests in the directory “CSVN_DEV/tests/functional”. Notice that the directory structure follows the Java convention for declaring packages, and it has already been configured as a source directory in the current Eclipse “.classpath” artifact on trunk.

csvn-functional-tests-packages.png

We have created the following convention for defining the packages and functional test classes:

  • com.collabnet.svnedge

The package containing the major abstract classes, which embrace code reuse by aggregating reusable patterns throughout the entire test infrastructure. The reusable utility methods were extracted during the first iteration of the development of the SvnEdge functional tests. For instance, the configuration keys from “CSVN_DEV/grails-app/conf/Config.groovy” can easily be read from the test cases using the method “getConfig()”, or just “config”. Similarly, the i18n keys can be read by calling “getMessage(‘key’)”, where “key” is one of the keys located in “messages.properties”, which renders the strings displayed in the user interface. Note that the English version of the i18n messages is used in the functional tests. Moreover, the abstract classes each have their own intent for the scenarios found in Subversion Edge:

  1. AdminLoggedInAbstractSvnEdgeFunctionalTests: test class that sets up the test case with the user “admin” already logged in (“/csvn/status/index”).
  2. LoggedOutAbstractSvnEdgeFunctionalTests: test class that starts the application at the index auth page where a user can log in (“/csvn/auth/index”).
  • com.collabnet.svnedge.console

The test cases related to the web console, or Subversion Edge itself. Each component must have its own package. For instance, take the name of the controller as the name of the component to be tested, such as “user” and “repo”; they should have their own test packages, “com.collabnet.svnedge.console.ui.user” and “com.collabnet.svnedge.console.ui.repo”, respectively. The only classes implemented at this time are the login tests.

  • com.collabnet.svnedge.teamforge

The test cases related to the TeamForge integration. As you will see, there is only one abstract class and two functional classes covering the functional tests of the conversion process: when the server has repositories to be imported (Full Conversion) and when the server does not have any repository created (Fresh Conversion). The latter case is a bit tricky, as the SvnEdge environment defines a fresh conversion when its database does not have any repository defined. Because of this, test cases related to repositories need to make sure to “discover” repositories if the intent is to verify the existence of repositories in the file system.

Running Functional Tests

As described in the Grails functional tests “mini bible”, the only thing needed to run a functional test case is the following command in the directory CSVN_DEV:

grails test-app -functional OPTIONAL_CLASS_NAME

The command starts the currently installed version of Grails using the functional tests environment. If you don’t provide the optional parameter “OPTIONAL_CLASS_NAME”, Grails executes all the functional tests defined. However, since executing all the current test classes takes more than 10 minutes, use the complete name of the test class (package name + class name, minus the suffix “Tests”). For instance, the following command executes the functional tests implemented in the class LoginFunctionalTests:

grails test-app -functional com.collabnet.svnedge.console.ui.LoginFunctional

The command selects the test suite class “CSVN_DEV/tests/functional/com/collabnet/svnedge/console/ui/LoginFunctionalTests.groovy” to be executed, and the output of the test run identifies the environment and the location where the test reports will be saved. The recommendation here is to keep using the Eclipse STS infrastructure to save your command executions, as shown below.

svnedge-eclipse-saved-execution.png

As shown below, the functional tests execution output is the same whether the tests are run from the command line or from the saved Eclipse command, as shown in the output view. The tests save their output logs and reports in the directory “CSVN_DEV/target/test-reports”.

Welcome to Grails 1.3.4 - http://grails.org/
Licensed under Apache Standard License 2.0
Grails home is set to: /u1/svnedge/replica_admin/grails/grails-1.3.4/

Base Directory: /u1/development/workspaces/collabnet/svnedge-1.3.4/console
Resolving dependencies...
Dependencies resolved in 1565ms.
Running script /u1/svnedge/replica_admin/grails/grails-1.3.4/scripts/TestApp.groovy
Environment set to test
    [mkdir] Created dir: /u1/development/workspaces/collabnet/svnedge-1.3.4/console/target/test-reports/html
    [mkdir] Created dir: /u1/development/workspaces/collabnet/svnedge-1.3.4/console/target/test-reports/plain

Starting functional test phase ...

Once the functional test execution finishes, the test reports are written and can be accessed using a web browser. The following snippet shows the result of running the test case started above: how long Grails took to execute the 4 test cases defined in the LoginFunctionalTests test suite, the stats of how many tests passed or failed, and the location of the test reports, along with the final result of PASSED or FAILED. Note that the directory “target/test-reports” is relative to the directory “CSVN_DEV”, as described above.

Tests Completed in 12654ms ...
-------------------------------------------------------
Tests passed: 4
Tests failed: 0
-------------------------------------------------------
2010-09-28 12:19:11,334 [main] INFO  /csvn  - Destroying Spring FrameworkServlet 'gsp'
2010-09-28 12:19:11,350 [main] INFO  bootstrap.BootStrap  - Releasing resources from the discovery service.
2010-09-28 12:19:11,350 [main] INFO  bootstrap.BootStrap  - Releasing resources from the Operating System service.
2010-09-28 12:19:11,352 [main] INFO  /csvn  - Destroying Spring FrameworkServlet 'grails'
Server stopped
[junitreport] Processing /u1/development/workspaces/collabnet/svnedge-1.3.4/console/target/test-reports/TESTS-TestSuites.xml
                  to /tmp/null1620273079
[junitreport] Loading stylesheet jar:file:/home/mdesales/.ivy2/cache/org.apache.ant/ant-junit/jars/ant-junit-1.7.1.jar
!/org/apache/tools/ant/taskdefs/optional/junit/xsl/junit-frames.xsl
[junitreport] Transform time: 2339ms
[junitreport] Deleting: /tmp/null1620273079

Tests PASSED - view reports in target/test-reports
Application context shutting down...
Application context shutdown.

Accessing the Test Results Report

Once the execution terminates, you have access to the test reports. This is where you will find all the answers about the test results, including detailed information on the entire HTTP payload transmitted between the SvnEdge server and the browser emulator that the functional tests use. As shown below, the location of the test reports is highlighted as a hyperlink to the index page of the reports. Clicking on it opens Eclipse’s built-in browser view with the reports.

svnedge-functiona-tests-execution-smaller.png

This report is generated per execution and is therefore deleted before each new run. In case you need to keep information from a test run, copy the contents of the directory “CSVN_DEV/target/test-reports”; you will find reports in both HTML and XML. The report for each test suite includes the list of test cases run, their status and execution time. The report includes 3 main outputs:

  • Properties: system properties used.
  • System.out: the standard output of the process; the same output printed in the grails output, but more organized.
  • System.err: the output of the standard error of the process.

The most used output is System.out. Clicking on this hyperlink takes you to the organized output of the traffic, highlighting the HTTP headers, HTTP body, redirects, test assertions and test results.

Identifying Test cases report scope

The link to the System.out output is the most important one, used throughout the development of a test case, as the output of the execution of each test case is displayed in this area.

svnedge-functional-tests-raw-report-eclipse.png

Each test case has its own test result scope, and you can easily identify the start of the execution of a test case by the key “Output from TEST_CASE_NAME”, where “TEST_CASE_NAME” is the name of the method that defines the test case. For instance, the log for the execution of the test cases of LoginFunctionalTests includes the following strings:

--Output from testRootLogin--
--Output from testRegularLogin--
--Output from testDotsLogin--
--Output from testFailLogin--

The output of the HTTP Request Header of a test case starts with “>>>>>”, as shown below:

>>>>>>>>>>>>>>>>>>>> Making request to / using method GET >>>>>>>>>>>>>>>>>>>>
Initializing web request settings for http://localhost:8080/csvn/
Request parameters:
========================================
========================================
Request headers:
========================================
Accept-Language: en
Accept: */*
========================================

On the other hand, the output of the HTTP Response Header of a test case starts with “<<<<<<”, as shown below. The HTTP Response Header parameters are output for verification of anything used by the test cases. Note that the following access to the “/csvn/” root context results in an HTTP forward to the “Login Page”, identified by the context “/csvn” and the controller “/login/auth”; therefore, there is no “Content” available.

<<<<<<<<<<<<<<<<<<<< Received response from GET http://localhost:8080/csvn/ <<<<<<<<<<<<<<<<<<<<
Response was a redirect to
  http://localhost:8080/csvn/login/auth;jsessionid=hueqpw5eaq32 <<<<<<<<<<<<<<<<<<<<
Response was 302 'Found', headers:
========================================
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Set-Cookie: JSESSIONID=hueqpw5eaq32;Path=/csvn
Location: http://localhost:8080/csvn/login/auth;jsessionid=hueqpw5eaq32
Content-Length: 0
Server: Jetty(6.1.21)
========================================
Content
========================================

========================================

#Following redirect to http://localhost:8080/csvn/login/auth;jsessionid=hueqpw5eaq32
>>>>>>>>>>>>>>>>>>>> Making request to http://localhost:8080/csvn/login/auth;jsessionid=hueqpw5eaq32
 using method GET >>>>>>>>>>>>>>>>>>>>

If the HTTP Response contains a body payload, it is output as-is in the Content section:

<<<<<<<<<<<<<<<<<<<< Received response from
  GET http://localhost:8080/csvn/login/auth;jsessionid=hueqpw5eaq32 <<<<<<<<<<<<<<<<<<<<
Response was 200 'OK', headers:
========================================
Expires: -1
Cache-Control: no-cache
max-age: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Language: en
Content-Length: 4663
Server: Jetty(6.1.21)
========================================
Content
========================================
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>

    <title>CollabNet Subversion Edge Login</title>
    <link rel="stylesheet" href="/csvn/css/styles_new.css"
          type="text/css"/>
    <link rel="stylesheet" href="/csvn/css/svnedge.css"
          type="text/css"/>
    <link rel="shortcut icon"
          href="/csvn/images/icons/fav
......
......
......

Whenever a test case fails, the error message is output as follows:

"functionaltestplugin.FunctionalTestException: Expected content to loosely contain [user.new] but it didn't"

Looking deeper in the raw output for the string “Expected content to loosely contain [user.new] but it didn’t”, you can see which HTML output was used in the evaluation of the test case. Sometimes an error is related to the current UI or to an external test verification. This specific one is related to the TeamForge integration, as the test server did not have a user named “user.new” in the list of users.

Failed: Expected content to loosely contain [user.new] but it didn't
URL: http://cu082.cubit.sp.collab.net:80/sf/sfmain/do/listUsersAdmin

Writing New Functional Test Cases Suite

This section describes how to create test suites using the Functional Tests plugin. In order to maximize code reuse, we defined a set of abstract classes to be used for specific types of tests, as shown in the diagram below. Instead of each test case extending the regular class “functionaltestplugin.FunctionalTestCase”, we created a more general abstract class, “AbstractSvnEdgeFunctionalTests”, to provide general access to the configuration artifact, the internationalization (i18n) message keys, among others. In addition to the infrastructural utility methods, the main abstract SvnEdge test class contains a set of often-used methods such as “protected void login(username, password)”, which is responsible for trying to log in to SvnEdge with a given “username” and “password”. The result of the command can then be verified in the body of the implementing class; more details later in this section. First, any test will implement one of the test scenario classes: “AdminLoggedInAbstractSvnEdgeFunctionalTests” or “LoggedOutAbstractSvnEdgeFunctionalTests”. However, the test cases for the conversion process needed a specialized abstract class, “AbstractConversionFunctionalTests”, which is of type “AdminLoggedInAbstractSvnEdgeFunctionalTests” because only the admin user can perform the conversion process.

svnedge-functional-tests-abstract-classes.png

As shown in the UML class diagram above, AbstractSvnEdgeFunctionalTests extends the Grails functional test class; in this way, it inherits all the basic assertion methods from JUnit and Grails. The class is shown in RED because it is a “PROHIBITED” class; that is, no classes but the GREEN ones should directly extend the RED class. The test case implementation in Subversion Edge has only 2 different types of tests and, therefore, new test cases should only inherit from “AdminLoggedInAbstractSvnEdgeFunctionalTests” or “LoggedOutAbstractSvnEdgeFunctionalTests”. Similarly, additional functional tests verifying other scenarios of the conversion process have to inherit the behavior of the abstract class “AbstractConversionFunctionalTests”.
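
In code form, the hierarchy boils down to the following skeleton; only the inheritance relationships come from the diagram, and the bodies are elided:

abstract class AbstractSvnEdgeFunctionalTests
        extends functionaltestplugin.FunctionalTestCase {
    // utility methods: getConfig(), getMessage(key), login(username, password), ...
}

abstract class AdminLoggedInAbstractSvnEdgeFunctionalTests
        extends AbstractSvnEdgeFunctionalTests { /* admin already logged in */ }

abstract class LoggedOutAbstractSvnEdgeFunctionalTests
        extends AbstractSvnEdgeFunctionalTests { /* starts at the login page */ }

abstract class AbstractConversionFunctionalTests
        extends AdminLoggedInAbstractSvnEdgeFunctionalTests { /* conversion scenarios */ }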

Basic Abstract Classes

As described in the previous sections, the two major types of test cases are for when the admin user is logged in and when no user is logged in. Tests that require different users to log in can use the latter test class to perform the login and navigate through the UI. Before continuing, it is important to note that the functional tests implementation is based on JUnit, using the 3.x method naming conventions. For instance, the methods “protected void setUp()” and “protected void tearDown()” are called before and after each test case defined in a test class. Furthermore, it is also important to call the super implementation of each of these methods because of the dependency on the Grails infrastructure. Take a look at the following JavaDocs to get an idea of the basic utility methods implemented in each of them.

Just as a reminder, upon executing the test cases defined in a class, JUnit executes the method “setUp()”. If any failure occurs there, Grails will fail not only the first test case, but ALL the test cases defined in the test suite; this is because “setUp()” is executed before each test case. Once the execution of a given test case is finished, the method “tearDown()” is executed. Any failure in that method also results in ALL test cases failing.

The abstract classes are designed to give the implementing concrete classes access to all the important features for the test cases. As mentioned earlier, utility methods to access the configuration properties and internationalization (i18n) messages are provided. In addition, convenient assertion methods are also implemented in the abstract classes. The next sections provide in-depth details of the implementation of the test suites.

Concrete Functional Tests Suites Implementation

The simplest implementation of the functional tests is the LoginFunctionalTests used as an example before. However, executing the scenario against the production version is the recommended first step before writing any code: you need to collect information about the scenario to be executed, choose the UI elements to use in your test case, etc. For instance, consider the login scenario of a user with a wrong username. By default, the development and test environments of Subversion Edge are bootstrapped with different users such as “admin”, “user” and “user.new”. Consider a scenario where an attempt to log in with a wrong username, “marcello”, is performed; the result is shown in the screenshot below:

svnedge-functional-test-scenario-login-error.jpg

The test case shows that, upon entering a wrong username and password, an error message is shown while the server responds with a complete and correct page (HTTP response code 200), even though an error occurred during the execution of the test case. Based on that information, the automated tests can be written in the test suite to cover the possible test cases for the different users in SvnEdge, including the wrong-input case. Note that each test case has its verification procedures in the super class, through the call to a method “testUserLogin”, whereas testFailLogin() is the only implementation located in LoginFunctionalTests itself. Other abstract and concrete test classes are shown in the UML class diagram below. Note that the YELLOW classes are the concrete classes that extend the functionality of the abstract classes.

svnedge-functional-tests-abstract-and-concret-classes.png

  • LoginFunctionalTests.html: the concrete functional test suite class that verifies the login for each of the different usernames, as well as the failure tests.
package com.collabnet.svnedge.console.ui

import com.collabnet.svnedge.LoggedOutAbstractSvnEdgeFunctionalTests;

class LoginFunctionalTests extends LoggedOutAbstractSvnEdgeFunctionalTests {

    @Override
    protected void setUp() {
        super.setUp();
    }

    @Override
    protected void tearDown() {
        super.tearDown();
    }

    void testRootLogin() {
        this.loginAdmin()
    }

    void testRegularLogin() {
        this.loginUser()
    }

    void testDotsLogin() {
        this.loginUserDot()
    }

    void testFailLogin() {
        this.login("marcello", "xyzt")
        assertContentContains getMessage("user.credential.incorrect",
            ["marcello"] as String[])
    }
}

The methods “loginAdmin()”, “loginUser()”, etc. are implemented in AbstractSvnEdgeFunctionalTests to allow code reuse in other test classes; the test case “testFailLogin()” therefore uses the basic method “AbstractSvnEdgeFunctionalTests.login(username, password)” for the verification of a user that does not exist. Also, note that the verification of the login scenario is as simple as verifying whether a given string exists in the resulting HTTP response output. For instance, when attempting to log in with a user that does not exist, the expected error message is “Wrong username/password provided for the user ‘marcello’”. That string is located in the messages bundle under the key “user.credential.incorrect”, and “getMessage()” is the helper method implemented in the class AbstractSvnEdgeFunctionalTests.

Another important thing to keep in mind is code convention. Test case names are written in camelCase, prefixed by the keyword “test”, and can be as long as needed (see “AbstractConversionFunctionalTests.html” for examples). The most important point is that the name of the method must be coherent with the steps being performed. Also, note that Groovy accepts a more relaxed invocation notation, which makes it easy to read:

        // JAVA method invocation Notation
        this.login("marcello", "xyzt")

        // GROOVY method invocation Notation
        this.login "marcello", "xyzt"

When it comes to the actual implementation of a given scenario, you will constantly refer to the Grails Functional Tests documentation, and that’s where you will find your “best friends”. Yes!!! Your best friends: the assert methods that help you verify the results of the HTTP response. But first, let’s take a look at the implementation of the basic methods that perform “login” and “logout”. As we know from the definition of the abstract classes, each time a method from a class that extends “LoggedOutAbstractSvnEdgeFunctionalTests” is executed, the setUp() method inherited from that class is executed first.

public abstract class LoggedOutAbstractSvnEdgeFunctionalTests extends AbstractSvnEdgeFunctionalTests {

    @Override
    protected void setUp() {
        //The web framework must be initialized.
        super.setUp()

        get('/')
        assertStatus(200)

        if (this.response.contentAsString.contains(
                getMessage("layout.page.login"))) {
            this.logout()
        }
    }

    @Override
    protected void tearDown() {
        //Stop Svn Server in case it is running
        this.stopSvnServer()

        this.logout()

        //The tear down method terminates all the web-related objects, and
        //therefore, must be performed in the end of the operation.
        super.tearDown()
    }
}

Note that the implementation of the concrete classes MUST call super.setUp() first, so that it executes the steps it depends on. As you can see in the class implementation above, the method setUp() first makes a request to “/”, that is, “http://localhost:8080/csvn/”, since the RESTful method “get()” uses the base URL plus the context name “/csvn”. Then, the first assertion verifies that the server is up and running and that the request did not return any error in the UI. Bookmark RFC 2616 and use the HTTP response status codes as required; the default one to verify is “200”, even when the scenario results in an error message, as in the test case “LoginFunctionalTests.testFailLogin()”. Finally, after verifying that the status code is as expected, the test uses the “response” object to verify that the HTML content contains the string identified by the key “layout.page.login” in the i18n artifact “CSVN_DEV/grails-app/i18n/messages.properties”. For this case, the method is verifying the key:

layout.page.login=Login

Following the way JUnit implements the test execution cycle, the method “tearDown()” is executed right after each “testXYZ()” method. In our case, there are a few steps to be verified before terminating the test case. Since the HTTP server might have been started during a test case, the method “stopSvnServer()” is called; this is placed in the “highest” abstract class because all types of test cases might want to start the HTTP server from the status page. After an HTTP request to “/” is performed, the output is verified and, if necessary, the method “logout()” is executed, as implemented in the abstract class “AbstractSvnEdgeFunctionalTests”. That is, if the HTML code of the response object contains the string identified by the key “layout.page.logout” (LOGOUT), then the test clicks the link “LOGOUT”. It then asserts that the HTTP response status equals 200 and that the content contains the header string “Login”, identified by the key “login.page.auth.header”.

    /**
     * Performs the logout by clicking on the link.
     */
    protected void logout() {
        def logout = getMessage("layout.page.logout")
        if (this.response.contentAsString.contains(logout)) {
            click logout
        }
        assertStatus(200)
        assertContentContains(getMessage("login.page.auth.header"))
    }

Similarly, test cases that perform login essentially fill out the login form and click on the button “Log in”. The basic implementation of the method “login(username, password)” is shown below. An HTTP GET request to the page “/login/auth” is performed, followed by the assertion of the status code. Then, if the test environment kept the user logged in as a result of a failure in a previous test case, the test verifies whether a user is logged in and, if so, calls the method “logout()” shown above. Finally, when the user is on the front page, the login form is filled out with the correct values. Please refer to the Grails Functional Tests documentation for details on how to fill out and submit form fields; it should be straightforward. The only detail needed is to capture the name of the form defined in the HTML code. A good approach is to use Google Chrome or the Firefox “Web Developer” plugin to capture the UI element “ids”. Specifically for the form submission, the ID of the form and the “id”s of the form fields are necessary. Then, the label of the SUBMIT button is needed; as shown in the code below, that string is located under the key “layout.page.login”.

svnedge-functional-tests-browser-show-form-properties.png

    protected void login(username, password) {
        get('/login/auth')
        assertStatus(200)

        if (this.response.contentAsString.contains(
                getMessage("layout.page.login"))) {
            this.logout()
        }
        def login = getMessage("layout.page.login")
        form('loginForm') {
            j_username = username
            j_password = password
            click login
        }
        assertStatus(200)
    }

It is extremely important to note a very hard problem when it comes to “clickable” items in the UI. Since we are using a mix of Grails GSP tags and some CSS styles from TeamForge, Grails creates the buttons differently for forms and for places without the HTML form entity. Whenever a form was generated by Grails, a submit button like the “login” one shown above will only respond to the command “click LABEL” inside the form() closure. On the other hand, for link-styled buttons, the command “click LABEL” will only perform its action when declared outside the form() closure. Different examples of these GOTCHAS were found while the conversion tests were being written.
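
Side by side, the two patterns look like this (both taken from the login and TeamForge examples in this document):

// Grails-generated form: click the submit button INSIDE the form closure
form('loginForm') {
    j_username = username
    j_password = password
    click getMessage('layout.page.login')
}

// link-styled button (TeamForge): click OUTSIDE the form closure
form('login') {
    username = ctfUsername
    password = ctfPassword
}
click "Log In"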

To summarize, the suggested steps to automate manual tests with corresponding functional tests are as follows:

  1. Perform the test scenario manually and gather the necessary information about the user interface, choosing unique elements that are present in the resulting action. For the case of login, the verification of the string “Logged in as:” is performed. For tests exploring failures and error messages, choose to assert the existence of those error messages.
  2. Once you are familiar with how the scenario behaves, create the main test case suite by extending one of the GREEN abstract classes in the UML class diagram shown above. Choose names related to the component.
  3. Promote code reuse by implementing new methods in AbstractSvnEdgeFunctionalTests if necessary, or if other components will use the same implementation. If not, keep the implementation in the test class being developed.
  4. Add JavaDocs to the methods that are going to be inherited or are difficult to understand. Try documenting the method execution before writing the test case, as you will understand the scope of the test better. The next section provides a good understanding of how to write that supporting documentation.

Advanced Functional Tests Techniques

Once you get used to writing automated test cases, you should be able to implement complex test cases that involve not only the local Subversion Edge server, but also external servers, such as the TeamForge server used during the tests of the conversion process. Don’t forget to document the steps in a structured way inside the JavaDocs; the documentation later makes it easier to understand the purpose of the tests.

Note that the JavaDocs of the classes contain a more detailed specification of the execution of the test cases. For example, sentences starting with “Verify” relate to the assertions necessary to verify the test case, while “Go to” relates to the HTTP request method “get()”. Each of the sections is identified so that the implementations of the methods setUp(), tearDown(), and the actual test method are explicitly written in Groovy. The source code has a more detailed implementation of the test cases.

Test Case 1: Successful conversion to TeamForge Mode

   * SetUp
        * Login to SvnEdge
        * Revert to Standalone Mode in case it is in TeamForge Mode

   * Steps to reproduce
         * Go to the Credentials Form
         * Enter correct credentials and existing CTF URL and try to convert;

   * Expected Results
         * Successful conversion message is shown
         * Login -> Logout as admin
         * Verify that the server is on TeamForge mode;
         * Login to CTF server and verify that the system ID
            from the SvnEdge server is listed on the list of integration servers

   * Tear Down
         * Revert conversion if necessary
         * Logout from the SvnEdge server

The implementation of complex test cases might require verification of different properties of local and external resources. The conversion process was the first challenge of this nature we had to implement. The following code snippet is the implementation of the assertions of the conversion as the expected results. Note that custom assertion methods were written to support this implementation (“assertProhibitedAccessToStandaloneModeLinksWorks()” and “assertConversionSucceededOnCtfServer()”).

    /**
     * Verify that the state of the conversion is persisted:
     * <li>The local server shows the TeamForge URL
     * <li>The CTF server shows the link to the server. This can be verified
     * by the current system ID on the list of integration servers.
     */
    protected void assertConversionSucceeded() {
        // Step 1: verify that the conversion is persistent
        this.logout()
        assertStatus 200

        this.loginAdmin()
        assertStatus 200

        assertContentContains(getMessage("status.page.url.teamforge"))
        // verify that the software version is still shown
        assertContentContains(getMessage("status.page.status.version.software"))
        assertContentContains(
            getMessage("status.page.status.version.subversion"))

        get('/server/edit')
        assertStatus 200
        assertContentContains(getMessage("server.page.leftNav.toStandalone"))

        // verify that prohibited links work
        assertProhibitedAccessToStandaloneModeLinksWorks()

        // Step 2: verify that the CTF server DOES list the system ID
        assertConversionSucceededOnCtfServer()
    }

Using the response object

As seen in some of the examples, assertions are the way to verify that a given expected value exists in the HTTP response payload received from the server. However, whenever a test case needs to make a decision based on the contents of the response, you can access the response object directly. For instance, instead of failing a test that needs the user logged out, the code snippet below verifies whether the user is logged in and then performs the logout procedure. The same logic can be applied in different scenarios, such as verifying whether the server is started or stopped via the status page button. Similarly, a test can verify whether there are any repositories in the file system before creating a new test repository.

        if (this.response.contentAsString.contains(getMessage("layout.page.login"))) {
            this.logout()
        }

Dealing with external resources

The nature of Subversion Edge requires integration with TeamForge; so how about testing the state of both systems in the same test case? Considering that the Grails plugin allows external HTTP requests during tests, why not perform the same steps an admin would do to verify the state of the server? This was a bit tricky, but works like a charm. As designed before, reusing the configuration was the first step to define which remote TeamForge server to use during tests. Then, the test case could take care of generating the URL for the CTF server based on the configuration parameters during the conversion tests. Here’s the closure in the file “CSVN_DEV/grails-app/conf/Config.groovy” where one can change which TeamForge server to use (svnedge.ctfMaster):

    ctfMaster {
        ssl = false
        domainName = "cu082.cubit.sp.collab.net"
        username = "admin"
        password = "admin"
        port = 80
        systemId = "exsy1002"
    }

Taking a closer look at what we needed, this relates to the assertions for the last expected result: “Login to CTF server and verify that the system ID from the SvnEdge server is listed on the list of integration servers”. The translation of this sentence into Groovy code originated the method call “AbstractConversionFunctionalTests.assertConversionSucceededOnCtfServer()”, as the steps to perform this assertion are used by all the different scenarios. As implemented, the first step requires that the login to TeamForge take the user to the administration page “List Integrations”, using the method “this.goToCtfListIntegrationsPage()”, before verifying whether the system ID saved by the conversion process exists on that page. However, observing how the HTTP request flow works in TeamForge was necessary to understand the forwards after the user is logged in. The method “loginToCtfServerIfNecessary()” was then implemented with all the needed values from both the Grails Config.groovy and the environment. As warned before, the clickable elements of forms can differ between Subversion Edge and TeamForge, and therefore the Grails element “click LABEL” was used here outside the form closure. Finally, don’t be tempted to verify strings in TeamForge using i18n, as the bundles are different and Subversion Edge does not have direct access to them. Prefer validating steps using form elements or IDs produced by TeamForge, as the UI can change on the remote server.

    /**
     * Verifies that the CTF server lists the current ctf server system ID.
     */
    protected void assertConversionSucceededOnCtfServer() {
        // NOTE: NO I18N HERE SINCE TEAMFORGE IS NOT I18N READY
        this.goToCtfListIntegrationsPage()
        assertContentContains(CtfServer.getServer().mySystemId)

        assertContentContains("Site Administration")
        assertContentContains("SCM Integrations")
        def appServerPort = System.getProperty("jetty.port", "8080")
        def csvnHostAndPort = server.hostname + ":" + appServerPort

        // TeamForge removes any double-quotes (") submitted via the SOAP API.
        assertContentContains("This is a CollabNet Subversion Edge server in " +
            "managed mode from ${csvnHostAndPort}.")
    }

    /**
     * Goes to the list of integrations on the CTF server
     */
    private void goToCtfListIntegrationsPage() {
        // Goes to the list integrations page
        // http://cu073.cloud.sp.collab.net/sf/sfmain/do/listSystems
        get(this.makeCtfUrl() + "/sf/sfmain/do/listSystems")
        this.loginToCtfServerIfNecessary()
    }

   /**
    * Makes login to CTF server from a given point that connects to the server.
    * In case the response content DOES NOT contains the string "Logged in as",
    * then make the login. The resulting page is the redirected page requested
    * earlier.
    */
    private void loginToCtfServerIfNecessary() {
        //NOTE: NO I18N HERE SINCE TEAMFORGE IS NOT I18N READY
        if (!this.response.contentAsString.contains("Logged in as")) {
            assertStatus 200
            def ctfUsername = config.svnedge.ctfMaster.username
            def ctfPassword = config.svnedge.ctfMaster.password
            form("login") {
                username = ctfUsername
                password = ctfPassword
            }
            // the button is a link instead of a form button. Use it outside
            // the form closure.
            click "Log In"
            assertStatus 200
        }
    }

Test Case Suites Needed

A few test cases have been written for specific functionalities of the application. However, here are some test cases that could still be developed.

* User Functional Tests
- Create User of each type
  - Login/Logout
  - Verify access to prohibited URLs
  - Access SVN and ViewVC Pages
- List Users
- Delete User
- Change User password
  - Logout and login with the new password.
  - Access SVN and ViewVC pages with new password
- View Self page
- Try changing the server settings, accessing other admin sections

* Repos Functional Tests
- Create Repo
- Discover Repos
- List Repos
- Edit Access Rules
  - Login with users without access to specific repos

* Statistics Functional Tests
- Access the pages for statistics

* Administration Functional Tests
- Changing server settings as Admin
- Changing the server Authentication settings
  - Login / Logout and verify changes.
  - Restart server after changing settings

* Server Logs Functional Tests
- Change log level
- View log Files
- View non-existing file
- View existing file
- View log files

* Packages Update Functional Tests
- Update the software packages
- Convert the server and try to update the server to new version

If you have any questions regarding the Functional Tests specification, please don’t hesitate to send an email to dev-svnedge@ctf.open.collab.net.

Marcello de Sales – Software Engineer - CollabNet, Inc.

Categories: java, Subversion

BlackBeltFactory: If you are a teacher at heart and love technology, this is your place…

While studying Java for fun and for the Java certification from Sun Microsystems back in 2004, I used to hang out on different tutorial websites with reviews for the exams. I was still living in Brazil, where I grew up, when I first started studying Java at the University and became passionate about the “Write Once, Run Anywhere” premise… When I found JavaBlackBelt in 2006, I joined to try to perfect my Java skills and keep up-to-date with the language. Given how Social Networking has transformed the Internet, everything has changed since then: they changed their branding and name to BlackBeltFactory, added social interaction capabilities, and built a marketplace for developers, technologists, and those who love to teach and learn.

My previous experience was just related to my own learning: practicing the fundamentals of the Java Programming Language. It was essentially a website where users could go and take exams on different subjects related not only to Java, but also to related technologies such as XML, Web Services, Hibernate, etc. However, I must confess that it is hard to keep up with the exams when you have your day-to-day job, school obligations, etc. I had earned the Java Blue Belt and was facing a lot of changes, starting with moving from New York to California for the dream of Silicon Valley, and then having the opportunity to spend another 2 years on the MS I had dreamed of and work with what I love: Java and Computer Science. The academic world can take all of your time with research papers to read (ACM was my browser’s start page) and exams/finals, so the only place where I could focus on practicing Java was my research projects: http://userwww.sfsu.edu/~msales/ (my thesis and conferences). So, I never became a JavaBlackBelt per se, and I was cleaning my mailbox when the name BlackBeltFactory started showing up on older and older emails. Yes, JavaBlackBelt had evolved and “taken the Social Networking train”. The list of changes is available on their website.

The very first basic change BlackBeltFactory made was to take advantage of their infrastructure and start thinking in a more “language/vendor-agnostic” way: why not offer training in other languages? I saw C#, among other programming languages, listed on their website, and I must say that BlackBeltFactory is a cool place to hang out and take exams prepared and reviewed by peers in the community. It is definitely a place to challenge your skill set on a given track. You can only take exams when you provide contributions: reviewing questions, adding comments, etc. This approach requires the user to be active in the learning community.

In my opinion, BlackBeltFactory’s natural progression could not have been different: take advantage of the Social Networking capabilities we live with today to provide users a better learning experience. Teaching is one of my passions, and I must say that BlackBeltFactory did a great job in adding features like “Become a coach”. After you have passed the exam to be a coach, it seems that you can offer either free or paid training to someone. Similarly, users interested in learning can ask others for teaching services on a specific topic. This marketplace is healthy and very interesting to me, in the sense that I don’t need to drive anywhere to teach someone something I’m passionate about. As far as I could see, BlackBeltFactory handles the process for both participants to engage in a program. Hummm… now I don’t need to think about going for a PhD in order to teach! :D

Another great feature is the translation capability. Although the previous version of the website, branded as JavaBlackBelt, was awesome for English-speaking users, the platform could not capture users of other nationalities without knowledge of English. As a Brazilian, I can say that it is difficult, in general, for those who are starting in our field of technology/science to properly “bootstrap” their careers because of the restricted access to content in Portuguese. That’s why I made sure Portuguese was one of the first translations of CollabNet Subversion Edge while working on that project. BlackBeltFactory just gave me yet another reason to stick around and contribute to their community, as I have a passion for learning and sharing knowledge.

All in all, I think I have to squeeze out more of my time to play around on BlackBeltFactory! For the love of teaching, I have already joined 2 Brazilian groups for the translations, and I will make time to review exams and try to get my Java BlackBelt :D I could not get even the yellow belt in Kung Fu when I was 15, but I think I might have potential for Java. I have also linked my LinkedIn and Twitter accounts, which is nice as a linking resource.

Google CodeSearch: your best friend to Reverse Engineer Android Code for your Android app

September 30, 2010

I’m in my 4th week developing my first Android Application: a client for the Discovery API of the CollabNet Subversion Edge server, which I started writing to validate the implementation of the Discovery API I had written based on a previous implementation using jmDNS for the Subversion Edge server. The first version was definitely exciting because I wanted to see how the Discovery API, which uses the Bonjour protocol (ZeroConf/mDNS from Apple), behaved on Android. Since I only have some evenings and weekends to work on it, it became difficult to keep track of all the topics learned, so I decided to do a crash course: I watched the video “Beyond Helloworld” to get an insight into what it is like to develop on Android. I got hooked!!! For a Java/Linux lover like myself, it is very addictive to develop on it, as I was once a JavaME developer and built a few applications during my Motorola Brazil Test Center internship. Anyway, here’s the YouTube video I made when I first finished version 0.0.1 :D There are still technical Strings exposed in the UI but, as you can see, I had fun producing a video using the Android Screencast and Mac iMovie.

After getting familiar with the terminology of Activity and Service classes, etc., I started developing the different features one by one. Considering that each application has its own unique requirements, you definitely need to understand the API, read the JavaDocs and the fundamental documentation provided by Google and others via blog posts. However, the more you develop a customized application with icons, colors, etc., the more addicted you get :)

Google Android - Add to Home screen Dialog with images

So, customizing components is sometimes tricky because of the different ways to implement something. However, there are common UI components and behaviors you will want to reuse in your application to keep its behavior standard. That’s when I had a requirement to add a custom Dialog to an item of the list of Subversion Edge servers discovered on the network, showing the options for that type of server. The Dialog needs to show the icon associated with the item displayed in the list, and the options with their associated icons. The section Creating Dialogs offers neither the implementation of the customized dialog I wanted nor an example of an Adapter that does the trick, and although I tried different ways to create a layout with icons, I couldn’t get it right. Maybe those are the things a novice Android developer does…

Considering other options to get an implementation of a Dialog Adapter for my requirement, I figured there might already be an example of what I wanted. That’s when I remembered that the long-press event on the home screen shows a Dialog with menu options and icons, as shown in the screenshot below:

I thought to myself, “Wait, Android is Open-Source!”, and it’s not the iPhone! :) At that moment I went to Google Code Search, a product still branded as part of “Google Labs”, which I use to look for open-source code (Eclipse, Subversion, etc, etc). So, I started using the menu title “Add to Home screen” as my search key, and I could only find it when I used the exact value found in the source code, suffixing the string with “</string>”, as found in strings.xml. Here’s the URL of the search.

http://www.google.com/codesearch/p?hl=en#4r7JaNM0EqE/res/values/strings.xml&q=%22Add%20to%20Home%20screen%3C/string%3E%22&sa=N&cd=1&ct=rc

<!-- Shortcuts -->
    <skip />
    <!-- Title of dialog box -->
    <string name="menu_item_add_item">Add to Home screen</string>

Bingo! I found the String with its associated key “menu_item_add_item”, and the only thing left to do was to find the implementation of the method that creates the menu item and reuse it in my own customized version for my long item press. So, I did another search, this time for the key “menu_item_add_item”, and another success!!! Here’s the implementation of the dialog shown in the screenshot above.

http://www.google.com/codesearch/p?hl=pt-BR#4r7JaNM0EqE/src/com/android/launcher/Launcher.java&q=menu_item_add_item&sa=N&cd=1&ct=rc

    /**
     * Displays the shortcut creation dialog and launches, if necessary, the
     * appropriate activity.
     */
    private class CreateShortcut implements DialogInterface.OnClickListener,
            DialogInterface.OnCancelListener, DialogInterface.OnDismissListener,
            DialogInterface.OnShowListener {

        private AddAdapter mAdapter;

        Dialog createDialog() {
            mWaitingForResult = true;

            mAdapter = new AddAdapter(Launcher.this);

            final AlertDialog.Builder builder = new AlertDialog.Builder(Launcher.this);
            builder.setTitle(getString(R.string.menu_item_add_item));
            builder.setAdapter(mAdapter, this);

            builder.setInverseBackgroundForced(true);

            AlertDialog dialog = builder.create();
            dialog.setOnCancelListener(this);
            dialog.setOnDismissListener(this);
            dialog.setOnShowListener(this);

            return dialog;
        }

Great! Note that the dialog uses its own Adapter implementation called “AddAdapter”, a second piece of code where the icons and titles might be defined. So, I looked for the Adapter class “AddAdapter.java” and was happy to find everything I needed there and, to my surprise, no layout implementation in XML. I guess sometimes hard-coded layouts are all you need if you don’t reuse the same thing in a different place. Looking into the class, I found the implementation of the menu items.

http://www.google.com/codesearch/p?hl=pt-BR#4r7JaNM0EqE/src/com/android/launcher/AddAdapter.java&q=AddAdapter.java%20LayoutInflater&l=63

    public AddAdapter(Launcher launcher) {
        super();

        mInflater = (LayoutInflater) launcher.getSystemService(Context.LAYOUT_INFLATER_SERVICE);

        // Create default actions
        Resources res = launcher.getResources();

        mItems.add(new ListItem(res, R.string.group_shortcuts,
                R.drawable.ic_launcher_shortcut, ITEM_SHORTCUT));

        mItems.add(new ListItem(res, R.string.group_widgets,
                R.drawable.ic_launcher_appwidget, ITEM_APPWIDGET));

        mItems.add(new ListItem(res, R.string.group_live_folders,
                R.drawable.ic_launcher_add_folder, ITEM_LIVE_FOLDER));

        mItems.add(new ListItem(res, R.string.group_wallpapers,
                R.drawable.ic_launcher_wallpaper, ITEM_WALLPAPER));

    }

As I mentioned earlier, each application is implemented differently, and the problem I had now was just a Java refactoring, as I had implemented my events differently. In the end, it paid off and I could implement my own custom dialog. That shows how awesome Google Code Search is! I will take that route the next time I need to implement similar functionality in my Subversion Edge Android Discovery Client and other future apps.

Subversion Edge Discovery Android Client - Custom Dialog with Icons

mongoDB Shards, Cluster and MapReduce: experiments for Sensor Networks

This post documents the use of mongoDB as the persistence layer for the data collected from NetBEAMS. It is divided into sections covering the setup and the CRUD (Create, Retrieve, Update, Delete) operations, as well as advanced topics such as data replication and the use of MapReduce. This document is a copy of the experiments performed for my Masters Thesis Report, entitled “A Key-Value-Based Persistence Layer for Sensor Networks“. The original wiki documentation can be found at MongoDBShardsClusterAndMapReduce.

The setup of the mongoDB shards must be performed on each cluster node. First, the relevant processes are started, and then the cluster must be configured with each of the shards, as well as the indexes of the collections to be used. Before continuing with this section, refer to mongoDB’s sharding documentation.

In order to start collecting data, the mongoDB server must be set up in a single-node or distributed way. Using the distributed cluster version requires starting the processes shown in the following listing (captured here via “ps” after startup):

marcello@netbeams-mongo-dev02:~/development/workspaces/netbeams/persistence$ ps aux | grep mongo
marcello  3391  0.0  0.2  67336  3328 pts/1    Sl   12:38   0:01 mongod --dbpath data/shards/shard-1/ --port 20001
marcello  3397  0.0  0.2  59140  3280 pts/1    Sl   12:38   0:01 mongod --dbpath data/shards/shard-2/ --port 20002
marcello  3402  0.0  0.2  59140  3276 pts/1    Sl   12:38   0:01 mongod --dbpath data/shards/shard-3/ --port 20003
marcello  3406  0.0  0.3 157452  3980 pts/1    Sl   12:38   0:01 mongod --dbpath data/shards/config --port 10000
marcello  3431  0.4  0.2  62004  3332 pts/1    Sl   12:38   0:35 mongos -vvv --configdb localhost:10000
marcello  3432  0.0  0.0   5196   704 pts/1    S    12:38   0:00 tee logs/mongos-cluster-head.log
In summary, these processes are defined as follows:
  • Shards Node: each shard process “mongod” is responsible for managing its own “chunks” of data on a given “dbpath” directory and port number. These processes are used by the cluster head “mongos”;
  • Cluster Metadata Server Node: the metadata server of the cluster can be located on a local or foreign host. The listing above shows the metadata server “config” located on the same server, managed by a “mongod” process. It carries information about the databases, the list of shards, and the list of “chunks” of each database, including their “ip_address:port” locations;
  • Cluster Head Server: the orchestration of the cluster is performed by the “mongos” process. It connects to the metadata server to select which shard to use, collect statistics and counters, etc. This is the main process that accepts client requests.

Make sure to pipe the output of the processes to log files. As shown in the listing above, the process “tee” is capturing the output of the “mongos” process; the mongoDB processes also accept parameters for logging (e.g. “--logpath”).
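For reference, the startup commands implied by the “ps” listing above look roughly like the following sketch; the dbpath and port values come from the listing, while the per-shard log file names are my own assumption (only the “mongos” log name appears above):

mongod --dbpath data/shards/shard-1/ --port 20001 | tee logs/shard-1.log &
mongod --dbpath data/shards/shard-2/ --port 20002 | tee logs/shard-2.log &
mongod --dbpath data/shards/shard-3/ --port 20003 | tee logs/shard-3.log &
mongod --dbpath data/shards/config --port 10000 | tee logs/config.log &
mongos -vvv --configdb localhost:10000 | tee logs/mongos-cluster-head.log &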

Considering that the proper processes are running, especially the metadata server and the main cluster head, the client process can be started to issue the commands that enable shards on a given database. Since the mongoDB client’s interface uses Javascript as the main programming-language abstraction to manipulate data, a script can be used to automate the setup of the server (a consolidated sketch is shown after the listings below). Before continuing, make sure you have covered mongoDB’s documentation on how to set up database shards.

First, connect to the server using the client process “mongo”, as shown in the following listing:

marcello@netbeams-mongo-dev02:~/development/workspaces/netbeams/persistence$ mongo
MongoDB shell version: 1.2.0
url: test
connecting to: netbeams
Sun Dec 20 14:22:49 connection accepted from 127.0.0.1:39899 #5
type "help" for help

After connecting to the server through the client, get references to 2 important databases: “admin” and “config”. The “admin” database is used for running commands against the cluster, while “config” is the reference to the metadata server. The following listing shows the use of the method “db.getSisterDB()” to retrieve those references:

> admin = db.getSisterDB("admin")
admin
> config = db.getSisterDB("config")
config
> 

Once the references are available, using those names as shortcuts makes access easier. Let’s add each of the shards running on the local and on the foreign server (192.168.1.2) on different communication ports. It is important to note that the issued commands are executed on the metadata server “config”.

> admin.runCommand( { addshard: "192.168.1.2:20001" } )
Sun Dec 20 16:04:02 Request::process ns: admin.$cmd msg id:-2097268492 attempt: 0
Sun Dec 20 16:04:02 single query: admin.$cmd  { addshard: "192.168.1.2:20001" }  ntoreturn: -1

> admin.runCommand( { addshard: "192.168.1.2:20002" } )
Sun Dec 20 16:04:03 Request::process ns: admin.$cmd msg id:-2097268491 attempt: 0
Sun Dec 20 16:04:03 single query: admin.$cmd  { addshard: "192.168.1.2:20002" }  ntoreturn: -1

> admin.runCommand( { addshard: "localhost:20001", allowLocal: true } )
> 

In order to be added to the list, a shard server must be running. If a shard is down at this point, it will not be added to the list of available shards. On the other hand, if it is added and later goes down, “mongos” keeps sending heartbeats to verify whether the shard has come back. Either way, use the command “listshards” to list the existing shards that the cluster head can use.

> admin.runCommand( { listshards:1 } )
Sun Dec 20 16:04:03 Request::process ns: admin.$cmd msg id:-2097268490 attempt: 0
Sun Dec 20 16:04:03 single query: admin.$cmd  { addshard: "localhost:20001", allowLocal: true }  ntoreturn: -1
Sun Dec 20 16:04:03 Request::process ns: admin.$cmd msg id:-2097268489 attempt: 0
Sun Dec 20 16:04:03 single query: admin.$cmd  { listshards: 1.0 }  ntoreturn: -1
{
        "shards" : [
                {
                        "_id" : ObjectId("4b2e8b3f5e90e01ce34de6ea"),
                        "host" : "192.168.1.2:20001"
                },
                {
                        "_id" : ObjectId("4b2e8b3f5e90e01ce34de6eb"),
                        "host" : "192.168.1.2:20002"
                },
                {
                        "_id" : ObjectId("4b2e8b3f5e90e01ce34de6ec"),
                        "host" : "localhost:20001"
                }
        ],
        "ok" : 1
}
> 

Enabling sharding means giving the metadata server “config” the name of the database to be sharded, as well as the definition of the shard key. The function “enablesharding” receives the name of the database. The following listing shows the database “netbeams” being enabled; then the shard key is defined, with the key “observation.pH” chosen as the shard key:

> admin.runCommand({enablesharding:"netbeams"})
{"ok" : 1}
admin.runCommand( { shardcollection: "netbeams.SondeDataContainer", key: { "observation.pH" : 1} } )
Sun Dec 20 16:04:03 Request::process ns: admin.$cmd msg id:-2097268488 attempt: 0
Sun Dec 20 16:04:03 single query: admin.$cmd  { enablesharding: "netbeams" }  ntoreturn: -1
Sun Dec 20 16:04:03 Request::process ns: admin.$cmd msg id:-2097268487 attempt: 0
Sun Dec 20 16:04:03 single query: admin.$cmd  { shardcollection: "netbeams.SondeDataContainer", key: { observation.pH: 1.0 } }  ntoreturn: -1
{"collectionsharded" : "netbeams.SondeDataContainer" , "ok" : 1}
> 
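Since the shell is just Javascript, the setup commands above can be consolidated into a single script and replayed after a wipe-out. The following is a minimal sketch, assuming a hypothetical file “setup-shards.js” run with “mongo setup-shards.js” against the same hosts and ports used above:

// setup-shards.js (hypothetical): replays the shard setup shown above
var admin = db.getSisterDB("admin");
// register each shard server (each must be running at this point)
admin.runCommand( { addshard: "192.168.1.2:20001" } );
admin.runCommand( { addshard: "192.168.1.2:20002" } );
admin.runCommand( { addshard: "localhost:20001", allowLocal: true } );
// enable sharding on the database and define the shard key
admin.runCommand( { enablesharding: "netbeams" } );
admin.runCommand( { shardcollection: "netbeams.SondeDataContainer", key: { "observation.pH" : 1 } } );
// print the registered shards for verification
printjson( admin.runCommand( { listshards: 1 } ) );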

The chunks represent the different sections of the data. Using the reference to the metadata server, call “config.chunks.find()” to list the chunk documents.

> config.chunks.find()
{ "lastmod" : { "t" : 1261341503000, "i" : 1 }, "ns" : "netbeams.SondeDataContainer", "min" : { "observation" : { "pH" : { $minKey : 1 } } },
"minDotted" : { "observation.pH" : { $minKey : 1 } }, "max" : { "observation" : { "pH" : { $maxKey : 1 } } }, "maxDotted" : { "observation.pH" : { $maxKey : 1 } },
"shard" : "192.168.1.2:20002", "_id" : ObjectId("4b2e8b3fb342bcd910b62ec9") }
> 

The next step is to create the indexes for the expected keys. This procedure can also be performed after the documents are inserted. In general, defining indexes slows down “Create” operations, but speeds up “Retrieve” ones. In order to proceed, make sure you have covered the documentation on mongoDB’s Indexes.

  • mongoDB Indexes: this is the documentation regarding indexes of keys on mongoDB.

Note, in the following listing, that the index definitions are written to the collection “netbeams.system.indexes”. A reference to the database “netbeams” is acquired using the function “db.getSisterDB()”, as was done for the databases “config” and “admin”. The method “db.collection.ensureIndex()” is then used.

> netbeams = db.getSisterDB("netbeams")
netbeams
> netbeams.SondeDataContainer.ensureIndex( { "message_id":1 } )
Sun Dec 20 16:04:03 Request::process ns: netbeams.system.indexes msg id:-2097268486 attempt: 0
Sun Dec 20 16:04:03  .system.indexes write for: netbeams.system.indexes
Sun Dec 20 16:04:03 Request::process ns: netbeams.$cmd msg id:-2097268485 attempt: 0
Sun Dec 20 16:04:03 single query: netbeams.$cmd  { getlasterror: 1.0 }  ntoreturn: -1
Sun Dec 20 16:04:03 Request::process ns: test.$cmd msg id:-2097268484 attempt: 0
Sun Dec 20 16:04:03 single query: test.$cmd  { getlasterror: 1.0 }  ntoreturn: -1

netbeams.SondeDataContainer.ensureIndex( { "sensor.ip_address":1 } )
netbeams.SondeDataContainer.ensureIndex( { "sensor.location.latitude":1 } )
netbeams.SondeDataContainer.ensureIndex( { "sensor.location.longitude":1 } )
netbeams.SondeDataContainer.ensureIndex( { "time.valid":1 } )
netbeams.SondeDataContainer.ensureIndex( { "time.transaction":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.WaterTemperature":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.SpecificConductivity":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.Conductivity":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.Resistivity":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.Salinity":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.Pressure":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.Depth":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.pH":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.pHmV":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.Turbidity":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.ODOSaturation":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.ODO":1 } )
netbeams.SondeDataContainer.ensureIndex( { "observation.Battery":1 } )

You can also verify the setup by accessing each of the collections of the config server. Using a client in a different shell, you can directly access and even modify (NOT RECOMMENDED) the settings of the metadata server, as shown in the following listing:

marcello@netbeams-mongo-dev02:~/development/workspaces/netbeams/persistence$ mongo config
MongoDB shell version: 1.2.0
url: config
connecting to: config
type "help" for help
> Sun Dec 20 16:31:57 connection accepted from 127.0.0.1:48589 #7
show collections
Sun Dec 20 16:32:01 Request::process ns: config.system.namespaces msg id:-128400130 attempt: 0
Sun Dec 20 16:32:01 single query: config.system.namespaces  { query: {}, orderby: { name: 1.0 } }  ntoreturn: 0
chunks
databases
shards
system.indexes
version

The method “find()” can then be used to list the contents of each of those collections. An example is listing the configured databases, showing the properties of each of them (partitioned or not, primary server host, etc), as shown in the following listing.

> db.databases.find()
Sun Dec 20 16:47:48 Request::process ns: config.databases msg id:-128400129 attempt: 0
Sun Dec 20 16:47:48 single query: config.databases  {}  ntoreturn: 0
{ "name" : "admin", "partitioned" : false, "primary" : "localhost:10000", "_id" : ObjectId("4b2e8b3fb342bcd910b62ec7") }
{ "name" : "netbeams", "partitioned" : true, "primary" : "192.168.1.2:20002",
                  "sharded" : { "netbeams.SondeDataContainer" : { "key" : { "observation" : { "pH" : 1 } }, "unique" : false } },
                  "_id" : ObjectId("4b2e8b3fb342bcd910b62ec8") }
{ "name" : "test", "partitioned" : false, "primary" : "192.168.1.2:20002", "_id" : ObjectId("4b2e8b3fb342bcd910b62eca") }

Before proceeding, make sure you have covered the basics of mongoDB use.

Using the mongoDB client process “mongo”, access a given “mongos” or “mongod” server. Accessing the “mongos” process executes commands in the context of the entire cluster through the metadata server “config”, while accessing a “mongod” process reaches a given shard server directly, which is useful for debugging. Specify the server location and which database to use. The following listing shows the command to access a given shard on a given port, using the database “netbeams”.

marcello@netbeams-mongo-dev02:~/development/workspaces/netbeams/persistence$ mongo 192.168.1.2:20001/netbeams
MongoDB shell version: 1.2.0
url: 192.168.1.2:20001/netbeams
connecting to: 192.168.1.2:20001/netbeams
type "help" for help

In order to verify the stats of a collection, use the function “db.collection.stats()”. This function reports the counters stored for the collection in the metadata server.

> db.SondeDataContainer.stats()
Sun Dec 20 14:54:24 Request::process ns: netbeams.$cmd msg id:-1701410104 attempt: 0
Sun Dec 20 14:54:24 single query: netbeams.$cmd  { collstats: "SondeDataContainer" }  ntoreturn: -1
Sun Dec 20 14:54:24 passing through unknown command: collstats { collstats: "SondeDataContainer" }
{
        "ns" : "netbeams.SondeDataContainer",
        "count" : 2364851,
        "size" : 1155567036,
        "storageSize" : 1416246240,
        "nindexes" : 40,
        "ok" : 1
}

A sample document can be fetched from one of the shards using the function “db.collection.findOne()”. It is a quick way to inspect an example of the collected data.

> db.SondeDataContainer.findOne()
Sun Dec 20 14:59:08 Request::process ns: netbeams.SondeDataContainer msg id:-1701410103 attempt: 0
Sun Dec 20 14:59:08 shard query: netbeams.SondeDataContainer  {}
Sun Dec 20 14:59:08  have to set shard version for conn: 0x2909de0 ns:netbeams.SondeDataContainer my last seq: 0  current: 4
Sun Dec 20 14:59:08     setShardVersion  192.168.1.2:20002  netbeams.SondeDataContainer  { setShardVersion: "netbeams.SondeDataContainer",
configdb: "localhost:10000", version: Timestamp 1261341503000|1, serverID: ObjId(4b2e8b3eb342bcd910b62ec6) } 0x2909de0
Sun Dec 20 14:59:08       setShardVersion success!
{
        "_id" : ObjectId("e26f40072f68234b6af3d600"),
        "message_id" : "b405e634-fd4b-450c-9466-82dc0555ea06",
        "sensor" : {
                "ip_address" : "192.168.0.178",
                "location" : {
                        "latitude" : 37.89155,
                        "longitude" : -122.4464
                }
        },
        "time" : {
                "valid" : "Sun Dec 06 2009 10:18:22 GMT-0800 (PST)",
                "transaction" : "Sat Dec 12 2009 01:52:42 GMT-0800 (PST)"
        },
        "observation" : {
                "WaterTemperature" : 23.45,
                "SpecificConductivity" : 35.4,
                "Conductivity" : 139.6,
                "Resistivity" : 899.07,
                "Salinity" : 0.02,
                "Pressure" : 0.693,
                "Depth" : 2.224,
                "pH" : 6.25,
                "pHmV" : -76,
                "Turbidity" : 0.2,
                "ODOSaturation" : 31.3,
                "ODO" : 54.83,
                "Battery" : 1.1
        }
}
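The function also accepts a query document, so a reading from a specific sensor can be fetched directly; a small sketch using the IP address from the sample document above:

> db.SondeDataContainer.findOne( { "sensor.ip_address" : "192.168.0.178" } )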

MapReduce

In order to proceed with this section, make sure you have the necessary background in the programming model “MapReduce”. The recommended documentation and tutorials are as follows:

  • Introduction to MapReduce: this training video class describes the MapReduce concepts using Hadoop and the Hadoop Distributed File System, which can be directly related to mongoDB’s implementation; a must-watch before proceeding;
  • mongoDB’s MapReduce HowTo: this is the main documentation of the MapReduce implementation and its use in mongoDB. It covers the basics and how the functions “map” and “reduce” can be implemented for a given collection of documents.

The first basic example of the use of MapReduce in a distributed system is counting. In my opinion, it is a good example of how the counting process can be spread across different machines. Using the regular client process “mongo”, access the database “netbeams”, as shown in the following listing:

marcello@netbeams-mongo-dev02:~/development/workspaces/netbeams/persistence$ mongo netbeams
MongoDB shell version: 1.2.0
url: netbeams
connecting to: netbeams
Sun Dec 20 14:22:49 connection accepted from 127.0.0.1:39899 #5
type "help" for help

At this point, you’re connected to the server running on the main host. Refer to the setup process described at the beginning of this documentation for more details. Our goal is to report the number of documents collected from the different sensors, keyed by their IP addresses. The strategy is to define a map function that emits the value 1 as a counter for each document, and a reduce function that sums the consolidated results after mongoDB’s MapReduce engine returns the intermediate values to be reduced.

  • The Map function: the following defines the map function, which uses the IP address of the sensor as the key and a count as the value. Note that mongoDB’s implementation differs from the Hadoop implementation: it does not pass the document as a parameter to the map function, because it uses the concept of “this”, which refers to the current document being visited during execution.
> m1 = function () {
    emit(this.sensor.ip_address, {count:1});
}
  • The Reduce function: the following defines the reduce function, which receives a given key (an IP address) and the list of values emitted for it. The function iterates over the returned values and increments the total with each element’s “count”, which in this case equals 1. (Any “…” in the listings below are just truncations from the mongoDB client shell.) The result is returned using the key “count”.
> r1 = function (key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i].count;
    }
    return {count:total};
}

Having defined the “map” and “reduce” functions, you can call the collection function “db.collection.mapReduce”, passing the function references as parameters. The following listing shows the execution of the command using the mongoDB shell, displaying the definition of each of the “map” and “reduce” functions before the execution:

> res = db.SondeDataContainer.mapReduce(m1, r1);
Sun Dec 20 14:26:02 Request::process ns: netbeams.$cmd msg id:-1701410106 attempt: 0
Sun Dec 20 14:26:02 single query: netbeams.$cmd  { mapreduce: "SondeDataContainer", map: function () {
    emit(this.sensor.ip_address, {count:1});
}, reduce: function (key, values) {
    var total = 0;
    for (var i = 0; i < va... }  ntoreturn: -1

After executing the function on each of the shards, the cluster head process “mongos” collects the values and consolidates the results. The output is temporarily stored in a collection whose name is returned in “res.result”, saved on a separate chunk. The output is shown as follows:

Sun Dec 20 14:33:15 ~ScopedDBConnection: _conn != null
Sun Dec 20 14:33:15 creating new connection for pool to:192.168.1.2:20002
Sun Dec 20 14:33:15 ~ScopedDBConnection: _conn != null
{
        "result" : "tmp.mr.mapreduce_1261348395_10",
        "shardCounts" : {
                "192.168.1.2:20002" : {
                        "input" : 2364851,
                        "emit" : 2364851,
                        "output" : 254
                }
        },
        "counts" : {
                "emit" : 2364851,
                "input" : 2364851,
                "output" : 254
        },
        "ok" : 1,
        "timeMillis" : 433282,
        "timing" : {
                "shards" : 433193,
                "final" : 89
        },
        "ok" : 1,
}

As shown in this output, the MapReduce result reports the counts of input and emitted documents, and the size of the final output. The output size makes sense, since there are 253 IP addresses defined for sensors on the 192.168.0.x network (0 is the subnet address, 255 the broadcast address). The input and emit values correspond to the total number of observations inserted during the Create operation; the Retrieve section showed the total as roughly 2.36 million documents. Again, the output of the function “db.collection.stats()” shows the total number of documents:

> db.SondeDataContainer.stats()
Sun Dec 20 14:54:24 Request::process ns: netbeams.$cmd msg id:-1701410104 attempt: 0
Sun Dec 20 14:54:24 single query: netbeams.$cmd  { collstats: "SondeDataContainer" }  ntoreturn: -1
Sun Dec 20 14:54:24 passing through unknown command: collstats { collstats: "SondeDataContainer" }
{
        "ns" : "netbeams.SondeDataContainer",
        "count" : 2364851,
        "size" : 1155567036,
        "storageSize" : 1416246240,
        "nindexes" : 40,
        "ok" : 1
}

The number of “emits” is the total number of documents visited by the “map” function, while the output count refers to the number of reduced result documents. In order to see the results, access the collection referenced by “res.result” and use the function “find()” to list them, as shown in the following listing (only the first 20 items are shown):

> db[res.result].find()                        
Sun Dec 20 14:34:43 Request::process ns: netbeams.tmp.mr.mapreduce_1261348395_10 msg id:-1701410105 attempt: 0
Sun Dec 20 14:34:43 single query: netbeams.tmp.mr.mapreduce_1261348395_10  {}  ntoreturn: 0
Sun Dec 20 14:34:43 creating new connection for pool to:192.168.1.2:20002
{ "_id" : "192.168.0.10", "value" : { "count" : 9408 } }
{ "_id" : "192.168.0.100", "value" : { "count" : 9371 } }
{ "_id" : "192.168.0.101", "value" : { "count" : 9408 } }
{ "_id" : "192.168.0.102", "value" : { "count" : 9500 } }
{ "_id" : "192.168.0.103", "value" : { "count" : 9363 } }
{ "_id" : "192.168.0.104", "value" : { "count" : 9355 } }
{ "_id" : "192.168.0.105", "value" : { "count" : 9281 } }
{ "_id" : "192.168.0.106", "value" : { "count" : 9320 } }
{ "_id" : "192.168.0.107", "value" : { "count" : 9341 } }
{ "_id" : "192.168.0.108", "value" : { "count" : 9464 } }
{ "_id" : "192.168.0.109", "value" : { "count" : 9285 } }
{ "_id" : "192.168.0.11", "value" : { "count" : 9201 } }
{ "_id" : "192.168.0.110", "value" : { "count" : 9397 } }
{ "_id" : "192.168.0.111", "value" : { "count" : 9258 } }
{ "_id" : "192.168.0.112", "value" : { "count" : 9242 } }
{ "_id" : "192.168.0.113", "value" : { "count" : 9231 } }
{ "_id" : "192.168.0.114", "value" : { "count" : 9446 } }
{ "_id" : "192.168.0.115", "value" : { "count" : 9550 } }
{ "_id" : "192.168.0.116", "value" : { "count" : 9409 } }
{ "_id" : "192.168.0.117", "value" : { "count" : 9256 } }
has more

Note that the final result shows the key “_id” being the IP address, as defined in the “map” function, and the result under “value.count”, since “value” is the default output key of the MapReduce engine and “count” was the key used in the “reduce” function.
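Individual keys can be looked up in the result collection the same way; a quick sketch using one of the IPs listed above:

> db[res.result].findOne( { "_id" : "192.168.0.10" } )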

Other use cases can be implemented with the same pattern (see the sketch below). The execution of this map-reduce was not fast because all the data lived on a single shard. MapReduce is designed to scale with the number of servers available: if the load is distributed over more shards, the results are returned proportionally faster.
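As a sketch of another use case (not part of the original experiments), the same pattern can compute the average pH per sensor, reusing the field names from the sample document shown earlier; the average itself is derived from the reduced sums afterwards:

> m2 = function () {
    // key: sensor IP; value: partial pH sum and document count
    emit(this.sensor.ip_address, {sum: this.observation.pH, count: 1});
}
> r2 = function (key, values) {
    var sum = 0, count = 0;
    for (var i = 0; i < values.length; i++) {
        sum += values[i].sum;     // accumulate pH totals
        count += values[i].count; // accumulate document counts
    }
    return {sum: sum, count: count};
}
> res2 = db.SondeDataContainer.mapReduce(m2, r2);

The average for each sensor is then “value.sum / value.count” on each document of “db[res2.result]”; note that the reduce function returns the same shape it receives, as required for re-reduction across shards.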

The shard logs reveal the details of the map and reduce operations. The following listing is from the log of one of the “mongod” shard processes, showing the instants of creation of the temporary collections for intermediate results. First, the request is received and both the map and reduce functions are set up to be executed.

Sun Dec 20 14:26:02 query netbeams.$cmd ntoreturn:1 reslen:179 nscanned:0 { mapreduce: "SondeDataContainer", map: function () {
    emit(this.sensor.ip_address, {count:1});
}, reduce: function (key, values) {
    var total = 0;
    for (var i = 0; i < va..., out: "tmp.mrs.SondeDataContainer_1261347962_5" }  nreturned:1 433257ms
Sun Dec 20 14:26:02 CMD: drop netbeams.tmp.mr.mapreduce_1261347962_9
Sun Dec 20 14:26:02 CMD: drop netbeams.tmp.mr.mapreduce_1261347962_9_inc

The “map phase” is executed first, and it must run to completion before the “reduce phase” takes place. In the scenario of counting the number of documents per IP address, the phases happen at different instants, as shown in the following listing. It also shows the intermediate results being indexed during the “map phase” and saved into the collection “netbeams.tmp.mr.mapreduce_1261347962_9_inc”:

                43700/2364851   1%
                96000/2364851   4%
                148300/2364851  6%
                200300/2364851  8%
                250900/2364851  10%
                300600/2364851  12%
                351600/2364851  14%
                403800/2364851  17%
                455800/2364851  19%
                508000/2364851  21%
                560500/2364851  23%
                601100/2364851  25%
                647500/2364851  27%
                699900/2364851  29%
                752300/2364851  31%
                804300/2364851  34%
                856100/2364851  36%
                907900/2364851  38%
                959000/2364851  40%
                1009800/2364851 42%
                1060800/2364851 44%
                1112800/2364851 47%
                1164100/2364851 49%
                1209400/2364851 51%
                1253700/2364851 53%
                1305400/2364851 55%
                1350900/2364851 57%
                1401700/2364851 59%
                1453100/2364851 61%
                1503100/2364851 63%
                1551500/2364851 65%
                1602600/2364851 67%
                1637100/2364851 69%
                1687600/2364851 71%
                1736800/2364851 73%
                1787600/2364851 75%
                1839900/2364851 77%
                1891100/2364851 79%
                1941400/2364851 82%
                1989900/2364851 84%
                2041800/2364851 86%
                2094300/2364851 88%
                2145500/2364851 90%
                2193500/2364851 92%
                2245100/2364851 94%
                2296200/2364851 97%
                2341700/2364851 99%
Sun Dec 20 14:28:24 building new index on { 0: 1 } for netbeams.tmp.mr.mapreduce_1261347962_9_inc...
Sun Dec 20 14:28:24 Buildindex netbeams.tmp.mr.mapreduce_1261347962_9_inc idxNo:0
       { ns: "netbeams.tmp.mr.mapreduce_1261347962_9_inc", key: { 0: 1 }, name: "0_1" }
Sun Dec 20 14:28:40      external sort used : 0 files  in 16 secs
Sun Dec 20 14:28:46 done for 1796343 records 22.486secs
Sun Dec 20 14:28:24 insert netbeams.system.indexes 22486ms
Sun Dec 20 14:28:47 building new index on { _id: ObjId(000000000000000000000000) } for netbeams.tmp.mr.mapreduce_1261347962_9...
Sun Dec 20 14:28:47 Buildindex netbeams.tmp.mr.mapreduce_1261347962_9 idxNo:0
      { name: "_id_", ns: "netbeams.tmp.mr.mapreduce_1261347962_9", key: { _id: ObjId(000000000000000000000000) } }
Sun Dec 20 14:28:47 done for 0 records 0.02secs

The execution of the “reduce phase” then starts and processes the intermediate results of the “map phase”, saving the final results in the new temporary collection “netbeams.tmp.mr.mapreduce_1261348395_10”.

                100/1796343     0%
                200/1796343     0%
Sun Dec 20 14:33:15 CMD: drop netbeams.tmp.mr.mapreduce_1261347962_9_inc
Sun Dec 20 14:33:15 CMD: drop netbeams.tmp.mrs.SondeDataContainer_1261347962_5
Sun Dec 20 14:33:15 end connection 192.168.1.10:38231
Sun Dec 20 14:33:15 connection accepted from 192.168.1.10:44062 #15
Sun Dec 20 14:33:15 connection accepted from 192.168.1.2:60641 #16
Sun Dec 20 14:33:15 building new index on { _id: ObjId(000000000000000000000000) } for netbeams.tmp.mr.mapreduce_1261348395_10...
Sun Dec 20 14:33:15 Buildindex netbeams.tmp.mr.mapreduce_1261348395_10 idxNo:0
         { name: "_id_", ns: "netbeams.tmp.mr.mapreduce_1261348395_10", key: { _id: ObjId(000000000000000000000000) } }
Sun Dec 20 14:33:15 done for 0 records 0secs
Sun Dec 20 14:33:15  mapreducefinishcommand netbeams.tmp.mr.mapreduce_1261348395_10 253
Sun Dec 20 14:33:15 CMD: drop netbeams.tmp.mrs.SondeDataContainer_1261347962_5
Sun Dec 20 14:33:15 ~ScopedDBConnection: _conn != null
Sun Dec 20 14:33:15 end connection 192.168.1.2:60641
Sun Dec 20 14:33:15 end connection 192.168.1.10:44062
Sun Dec 20 14:34:43 connection accepted from 192.168.1.10:44063 #17

NOTE: If the results are important, make sure to save them into a permanent collection, since the temporary collection returned by a map-reduce run is purged upon a new access to the server through the mongoDB client.
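A minimal sketch of persisting the output before disconnecting, assuming a hypothetical permanent collection “ip_counts” in the “netbeams” database:

> db[res.result].find().forEach(function (doc) {
    db.ip_counts.save(doc); // copy each consolidated count into a permanent collection
})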
