Blog

Announcing Mapmeter: a new tool for analyzing geospatial deployments

This week at FOSS4G-North America 2013 in Minneapolis, we are excited to announce a full public beta of our new product. What we’ve previously referred to as “The Enterprise Console” is now Mapmeter, a full administration and management tool for analyzing GeoServer systems.

Mapmeter enables organizations to monitor the health of production deployments, optimize applications during development and diagnose critical issues. With these details, administrators and managers can better — and more cost effectively — make decisions about their geospatial deployments. With Mapmeter, spatial monitoring and reporting become a primary component in your spatial IT workflow.

You may have heard us talk about this product in recent months. We started with an announcement at FedGeo, then Alyssa Wright showed a live demo with James Fee, and some of you were able to get in on a private beta. With the help and feedback of our early testers, we are now able to open up the software to all.

If you want to learn more or see a demo, please join us for the Mapmeter launch at Sponsor Day at FOSS4G-North America at 9 a.m. Friday, May 24 (look for the OpenGeo room). Sponsor Day is free, but you need to register. We’ll also be holding office hours at our FOSS4G North America booth today (5/22/2013) from 2:00pm to 3:00pm.

If you won’t be able to join us in Minneapolis, head over to http://mapmeter.com to learn more about these exciting new features, connect with the development team and join us for this beta program.

We hope you’re as excited as we are about Mapmeter! If you have questions, please contact us. You can also contact the Mapmeter team directly for personalized help with the public beta.

New Job Postings

OpenGeo is looking for talented people to join our team. We offer interesting technical work, competitive salaries, great benefits, and a fantastic working environment. Most importantly, we challenge our employees to build the best open source and interoperable tools for spatial data on the web. We added a few new posts this week. If any look like a fit for you, please apply!

Here’s a list of our open positions:

UX Developer -  We’re seeking a talented user experience developer to design and implement creative user interfaces for our innovative open source geospatial software.

Support Manager -  OpenGeo is looking for a support manager to ensure that customers large and small are familiarized with our software, properly trained in its function, and supported if anything should go wrong. The ability to think quickly and communicate clearly in a fast-paced environment is essential. Enthusiastic problem-solving skills and a desire to be engaged at all levels of a problem are even better.

Software Project Manager -  OpenGeo is seeking a skilled Software Project Manager to help us bring open source software to governments, commercial enterprises, NGOs, and other organizations around the world.

Java Developer - OpenGeo is seeking skilled software engineers interested in helping us bring open source software to organizations around the world. Our team improves the open source components underlying the OpenGeo Suite, allowing a wide variety of customers to share and edit data using open standards.

Front End Developer -  We’re looking for someone who is ready to work with peers in design and engineering to create pixel-perfect interfaces across a range of projects and products. You’ll own the code-base, work on the hard problems, build your ideas into reality, and help determine best practices throughout our organization.

Sales Account Manager – Our current (and future) clients are looking to open source to solve their spatial IT needs. Our account managers help commercial enterprises and federal clients use our innovative, open source geospatial software as efficiently and effectively as possible, allowing them to get more than ever out of their geospatial instances.

Here’s the full list; please apply and/or spread the word!

OpenGeo Emerges

This week OpenGeo took an exciting and important step forward as an organization. We’ve taken on investment and spun out from OpenPlans, our long time parent organization, to establish ourselves as an independent company. Our growth and this successful step out on our own are the result of our amazing team and the success of open source geospatial software that we’ve been working on for over ten years.

Vanedge Capital, a Vancouver-based venture capital firm, led the Series A round of investment that made this possible. We are truly excited to begin a partnership with Vanedge, an innovative fund led by partners who know how to grow and manage software technology companies.

This investment provides the capital we need to meet our objectives and continue to develop innovative technologies. If you’re a regular reader of this blog or have seen us lately at conferences or events, you know about the ambitious projects we’ve been working on: through-the-web-processing, breaking out of the GIS work-flow with Spatial IT, geospatial web-analytics and distributed versioning for geospatial – to name a few. This type of development requires not just the strong technical skills and forward-looking leadership that our team has, but it also requires resources, which Vanedge’s investment provides.

This investment also allows us to achieve our long-planned separation from OpenPlans, which founded and incubated us. We are grateful for the support and vision of OpenPlans over the years. And, since OpenPlans remains an investor in our new company, we’re looking forward to our continued partnership with them.

Our mission remains the same: to build the highest quality software for location and mapping, available to all. This investment gives us a stronger base of resources to support the open source communities we work with. We remain committed to the open source principles of collaboration, transparency, and freedom. We’ll be doing even more to develop the best geospatial tools while supporting the open source communities and our customers alike.

Look for more from us about the future of Spatial IT and how we can help you get there.

Find out more about this important step forward and please contact us if you have any questions.

GeoServer in a clustered configuration (part 2)

In our last post on clustering, we talked about the theory behind some different options for clustering. In this post, we’ll go into an example of clustering, taken from our recent experience with one of our OpenGeo Suite Enterprise clients. If you’ll be attending FOSS4G-NA and want to learn more about clustering and GeoServer, consider attending our GeoServer training and Juan Marin’s GeoServer in Production presentation (scheduled for 5/23/2013 at 11:30 am).

Clustering Scenario

In the following scenario, we will work through the installation and configuration of two GeoServers, each inside its own servlet container instance on the same machine. Each servlet container will use the same JRE and the same container binaries (Apache Tomcat 7), but they will have independent configurations that allow them to run on different ports. These two GeoServer/Tomcat instances will be fronted by a local software proxy called HAProxy, which acts as an HTTP/TCP load balancer. The load balancer configuration provides very basic “round robin” balancing of GeoServers. More sophisticated load-balancing configurations are possible, but are beyond the scope of this example. All GeoServers will be deployed as WAR files placed into each of the Tomcat webapps directories. It is possible to have multiple instances of Tomcat share a single web application through the use of contexts. This is useful if you anticipate that your web application (GeoServer) will be changed or updated frequently, but it isn’t necessary.

Alpha releases

One thing I love about open source development is the ‘alpha’ release.

Last week was an exciting week of alphas for OpenGeo: both OpenLayers 3.0 and GeoGit had their first releases and launched new websites. The two websites are admittedly not very sophisticated (I made geogit.org with GitHub’s page generator and Andreas pulled together ol3js.org with Bootstrap), but awesome websites can come later. The point of these alpha releases is to get something out in the world and widen the open source process to new users and potential contributors.

Alpha releases are rarely seen in proprietary software development since software in an alpha state is generally quite buggy. To quote Wikipedia: “alpha software can be unstable and could cause crashes or data loss.” At this point many would turn away and run as far from the software as possible, but to me it’s an awesome thing, an understood pact between the developers and the users that says: “hey, we’re not perfect, and we know our software is far from perfect, but if you understand the risks we’d be really excited to show it to you.”

The process opens up a dialog of equals—not the typical consumer relationship, but a collaborative one. The user of alpha software actually has a responsibility to communicate when (not if) things go wrong and to tell the developers how it crashes, what important option isn’t there, how the installation fails, or even how the website is confusing. In this way, responsibility can grow from being an alpha user to include helping with documentation, improving the website, debugging problems, contributing patches, and eventually building major new features as a core developer. Indeed the point of the alpha release is to put a stake in the ground and open the process to gain feedback from others, allowing users and developers to build the future together. Everyone is expected to be a true participant, in the fullest sense of the word, with responsibilities as well as privileges as opposed to just a passive consumer.

We encourage you to check out both the OpenLayers 3.0 and GeoGit alpha releases and let the teams know what you think. OpenLayers in particular has a very solid core but is looking for practical input from real users. We think the projects show a lot of potential, and we’re excited for your feedback, encouragement, and even contributions. Don’t hesitate to jump in and join us as we build the geospatial future together.

GeoServer in a clustered configuration (part 1)

Recently, we helped one of our clients who wanted to set up a GeoServer cluster. There are different ways to accomplish clustering depending on your specific needs, but we thought it would be illustrative to show what we did in this particular situation. Keep in mind this is a specific treatment and fairly tailored. We encourage you all to experiment with the newest features, but remember to do so in your testing environment!

We’ll start with some clustering theory and tips before launching into the actual details of how to do it.

Background

A computing cluster consists of two or more machines working together to provide a higher level of availability, reliability, and scalability than can be obtained from a single node. Nodes in a cluster are positioned behind a proxy server and/or load balancer that delegates requests to cluster members based on any one member’s ability/availability to handle load.

Clustering

Similar to other applications with long-running in-memory states and high data I/O, GeoServer sees performance gains with two (or more) nodes clustered behind a load balancer—even with the slight overhead of the load balancer that sits in front of the cluster.

Generally, there are two complementary purposes for clustering GeoServer:

  • To provide high-performance and/or throughput
  • To achieve high availability

In the most demanding situations, GeoServer can be deployed in combinations of high-performance and high-availability instances.

High-Performance Clusters

A high-performance GeoServer configuration deploys several instances of GeoServer on a single machine.

High-performance cluster

Each GeoServer instance is deployed into its own servlet container (Tomcat, Jetty, etc.). Individual servlet containers are configured independently and spin up their own JVM, each with its own memory and processor allocations (borrowed from the pool of resources on the host machine). GeoServer’s memory and CPU runtime footprint are optimized for high throughput under heavy concurrency with such a deployment, but always consider that these different deployed units will compete for the physical server’s resources. To find the best balance, we recommend, as always, testing against your particular scenario.

A load balancer or proxy fronts the cluster, and directs traffic to the member of the cluster most able to handle the current request. In this case, nodes will likely share the same server name or IP address, but listen for requests on different ports. For example:

Load Balancer @ http://<server>:80/<alias> forwards to one of:

  • GeoServer 1 in Tomcat 1 @ http://<server>:8081/geoserver
  • GeoServer 2 in Tomcat 2 @ http://<server>:8082/geoserver
  • GeoServer 3 in Tomcat 3 @ http://<server>:8083/geoserver
  • GeoServer 4 in Tomcat 4 @ http://<server>:8084/geoserver
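
To make the forwarding idea concrete, here is a minimal Python sketch of the simplest possible policy, a fixed rotation over the four instances. It is only an illustration, not how any particular load balancer is implemented: the host name `localhost` and the helper names `backend_urls` and `round_robin` are assumptions made up for this example.

```python
from itertools import cycle

# Hypothetical host and ports, mirroring the four-instance example above.
HOST = "localhost"
PORTS = [8081, 8082, 8083, 8084]


def backend_urls(host, ports, context="geoserver"):
    """Build one base URL per GeoServer/Tomcat instance."""
    return ["http://%s:%d/%s" % (host, port, context) for port in ports]


def round_robin(urls):
    """Return an endless iterator that hands out backends in a fixed rotation."""
    return cycle(urls)


backends = round_robin(backend_urls(HOST, PORTS))

# Five incoming requests: the fifth wraps around to the first instance.
first_five = [next(backends) for _ in range(5)]
for url in first_five:
    print(url)
```

A real load balancer would also take each instance's health and current load into account; the rotation above only shows the same-host, different-port layout.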

An approach that deploys multiple instances of GeoServer into the same servlet container is not recommended. In this case, since host resource allocation (to a common JVM) will not be sequestered as neatly, competition for those resources will occur, limiting the benefits.

Users might also consider using the built-in clustering capabilities found in Enterprise Application Servers (such as Oracle Weblogic or JBoss), however this is beyond the scope of this discussion.

High-Availability Clusters

A high-availability implementation will spread several GeoServer instances across several machines (nodes) in a cluster. These nodes can be physical or virtual machines.

High-availability cluster

Nodes are normally located behind a load balancer that redirects traffic to any single GeoServer based on traffic volume and availability. In this case, nodes will likely be on different servers or IP addresses and listen for requests on the same port. For example:

Load Balancer @ http://<server>:80/<alias> forwards to one of:

  • GeoServer 1 in Tomcat 1 @ http://<server1>:8080/geoserver
  • GeoServer 1 in Tomcat 1 @ http://<server2>:8080/geoserver
  • GeoServer 1 in Tomcat 1 @ http://<server3>:8080/geoserver
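
As a rough illustration of selecting a node "based on traffic volume and availability", the Python sketch below picks the available node with the fewest requests in flight. The node records, the `available` and `active_requests` fields, and the `pick_node` helper are all hypothetical; a production load balancer tracks this state itself through health checks and connection counts.

```python
def pick_node(nodes):
    """Return the available node with the fewest active requests, or None."""
    candidates = [node for node in nodes if node["available"]]
    if not candidates:
        return None  # every node is down; nothing can serve the request
    return min(candidates, key=lambda node: node["active_requests"])


# Hypothetical snapshot of a three-node cluster, all listening on port 8080.
nodes = [
    {"url": "http://server1:8080/geoserver", "available": True, "active_requests": 12},
    {"url": "http://server2:8080/geoserver", "available": False, "active_requests": 0},
    {"url": "http://server3:8080/geoserver", "available": True, "active_requests": 4},
]

chosen = pick_node(nodes)
print(chosen["url"])
```

Note that server2 is skipped even though it has no active requests: an unavailable node is never a candidate, which is what gives the cluster its high-availability behavior.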

Data directory location and catalog reloads

Some important considerations to be made when clustering several instances of GeoServer concern the location of the GeoServer data directory and a strategy for reloading all cluster members’ data catalogs.

The GeoServer data directory is the location in the file system where GeoServer stores its configuration information. The configuration defines things such as what data is served by GeoServer, where it is stored, and how services such as WFS and WMS interact with and serve the data. The data directory also contains a number of support files used by GeoServer for various purposes.

The spatial data accessed by GeoServer doesn’t need to reside within the GeoServer data directory, just pointers to the data locations. This should be obvious for data stored in spatial databases, which are certainly in different locations (on disk) and often on different machines; however the same is true for file-based spatial data. (Read more about the GeoServer data directory.)

GeoServer’s catalog is an in-memory representation of the configurations in the data directory. Storing the configurations in memory means that GeoServer can access this information faster than by reading these instructions off disk. However, this sometimes requires that the in-memory catalog be refreshed when configuration changes are made to the disk-based GeoServer data directory, or to the actual data served in GeoServer.

Unless catalog (re)configurations are largely static, or some amount of catalog discrepancy or availability is acceptable, a common GeoServer data directory location for all clustered instances is highly recommended.

The location of the GeoServer data directory is stored in the GEOSERVER_DATA_DIR variable. It can be configured in one of three ways: in each instance’s web.xml file (/webapps/geoserver/WEB-INF), through a common environment variable, or through a parameter passed to the JVM in the container start-up command.

Some implementations have clustered GeoServer instances using separate data directories that are synchronized either manually (when changes are infrequent) or automatically (using rsync), but neither approach is as common or recommended as a shared data directory.

Regardless of the mechanism for synchronization, changes to the data directory and the in-memory catalog will normally be directed by one master GeoServer. This can be enforced by disabling the GeoServer user interface on all “slave” GeoServers or by configuring the front-end load balancer to only direct user interface requests to /geoserver/web to the master GeoServer.

Changes to the master GeoServer’s data catalog must be explicitly refreshed on slave instances. This can be accomplished manually through the GeoServer Admin web UI (/geoserver/web), or with some measure of automation (on a schedule, or after a trigger is fired) using GeoServer’s REST API (e.g. by sending a POST/PUT request to /geoserver/rest/reload?recurse=true).
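
The automated reload could be scripted along the following lines. This is a sketch under stated assumptions rather than a tested deployment script: the `/rest/reload?recurse=true` path comes from the paragraph above, while the use of HTTP basic authentication, the credentials, the server URLs, and the helper names `build_reload_request` and `reload_instances` are assumptions made for illustration.

```python
import base64
import urllib.request


def build_reload_request(base_url, user, password):
    """Build (but do not send) a POST to one instance's catalog reload endpoint."""
    url = base_url.rstrip("/") + "/rest/reload?recurse=true"
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, data=b"", method="POST")
    # Assumption: the REST API is protected with HTTP basic authentication.
    request.add_header("Authorization", "Basic " + token)
    return request


def reload_instances(base_urls, user, password, timeout=30):
    """Ask each instance to reload its catalog; map each URL to a status or error."""
    results = {}
    for base_url in base_urls:
        request = build_reload_request(base_url, user, password)
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                results[base_url] = response.status
        except OSError as error:  # URLError and HTTPError are OSError subclasses
            results[base_url] = error
    return results


# Hypothetical usage, e.g. from a scheduled job, after changing the master:
# reload_instances(["http://server2:8080/geoserver",
#                   "http://server3:8080/geoserver"], "admin", "secret")
```

Collecting a result per instance, instead of stopping at the first failure, makes it easier to see which cluster members are still serving a stale catalog.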

Clustering Enhancements

Enhancements to our clustering story are coming! Specifically, in future releases of GeoServer the data directory will have the option to be database-backed. This means that a central configuration store can be queried more optimally than a file-based counterpart and doesn’t all need to be read into memory.

In the next post, we’ll go into the details on setting up a clustered instance. Remember, Enterprise: Platform clients and higher get custom clustering and deployment advice included in their maintenance agreements.

Have you been looking at deploying GeoServer in a clustered environment? Tell us about it!

Why We Sprint

I spent last week in Boston, attending an annual code sprint for C-based open source geospatial projects.  I’ve been doing this every year since 2008.  Since getting back, I’ve had to explain the event to several people, technical and non-technical, since the concept isn’t obvious at all.

Open source development is characterized by some features that differ a great deal from traditional work environments:

  • the developers work asynchronously, often in different time zones, usually in different locations,
  • the developers coordinate exclusively using text tools, like e-mail, issue tracking systems, and sometimes instant messaging

Because there is no need to be in the same space with other developers, either physically or even temporally, the barriers to entry to a project are lowered. More people can participate than otherwise.

However, there are disadvantages to working asynchronously and with text communications.

  • asking for help when you get stuck can be time consuming, because your colleagues might be asleep at the moment when help would be most useful
  • issues of subtlety or complexity take a great deal of text to describe, and any misunderstandings on the part of a reader take even more text to correct
  • discussion of emotional issues can lead to conflict due to the limited emotional nuance in text communication

A code sprint is a chance to work for a time with your open source colleagues “the old fashioned way”, face to face, on the same clock.

Because everyone is together, and communications are high-bandwidth and high-fidelity, a code sprint is a great time for:

  • planning and designing large scale changes to the code
  • designing new APIs or new user interfaces, and
  • triaging ticket lists to prepare for release

I usually spend the first half of a sprint on communication-heavy tasks like the ones above. The second half I usually spend heads down on a hard piece of code.

If the right experts are around, code sprints are an excellent time to attack a new piece of code you don’t quite understand. Learning how a module works from the expert who wrote it is far faster than doing it alone at home.

And finally, having lunch and dinner and socializing usually provides the social space for unexpected topics to slip out and get discussed, whether they be uncomfortable issues like dealing with a difficult team member or just a crazy feature idea that turns out to be not so crazy at all when discussed with the group.

If you have a chance to participate in a code sprint on a project you contribute to, don’t pass it up!

OpenGeo at FOSS4G North America, 2013

We always look forward to opportunities to get together with our friends, colleagues and clients to discuss what’s new in geospatial technology. The FOSS4G conferences of recent years have consistently offered the best opportunity to do just that; that’s one reason why we’re so excited about this year’s FOSS4G North America conference.

Last year we had a pretty big hand in organizing the inaugural DC conference, and it went so well that it’s become an annual event. This year we’ve stepped back from the day-to-day planning, but we’re still helping out on the program committee, and we’re happy to support the conference as gold sponsors.

We’ll be sending a pretty large contingent to Minneapolis; nine OpenGeo presentations were accepted and we’ll be teaching four workshops. We’re looking forward to an exciting (and busy!) week and hope to see you there. Don’t forget to register to attend; we hear spots are filling up quickly. And if you’ll be in Minneapolis make sure to come by our exhibition table, you never know who you’ll run into.

Interested in hearing us speak? Want to enroll in a workshop? The preliminary program has been announced, and the times and dates of specific talks will be posted soon. Scroll down to see a list of what we’ll be up to at FOSS4G-NA:

OpenGeo FOSS4G-NA Presentations (Marriott City Center in Minneapolis, MN)

  • GeoServer CSS: David Winslow
  • Say Hello to OpenLayers 3: Tim Schaub & Eric Lemoine of Camptocamp
  • OpenLayers 3: Vectors Redux: Tim Schaub & Andreas Hocevar
  • Scripting GeoServer with GeoScript: Tim Schaub
  • LIDAR in PostgreSQL with PointCloud: Paul Ramsey
  • GeoServer in Production: Juan Marin
  • State of GeoServer: Justin Deoliveira
  • PostGIS Feature Frenzy: Paul Ramsey
  • Diversity in FOSS4G Mailing List: An Analysis: Alyssa Wright & Georgia Bullen of the Open Technology Institute

OpenGeo FOSS4G-NA Workshops (University of Minnesota in Minneapolis, MN)

GeoScript in Action: Part One

This is the first post of a three-part series dedicated to showing the versatility and functionality of GeoScript. 

The rumors are true: GeoScript is pretty awesome. How awesome? We’ll let you be the judge. In this post we’ll focus on data exploration and show you how to create a few visualizations right from the GeoScript command line, but GeoScript can do much more than that. In subsequent posts we’ll build up enough code and data to create other GeoScript products, then we’ll show how to easily refactor this code into processing web services. If you’d like to follow along with this post please install GeoScript. For the purposes of these exercises make sure that you download the latest version. For our examples we’ll be using the Python flavor of GeoScript. Make sure to keep the API doc close at hand and don’t hesitate to experiment. If you get stuck, please contact us and we’ll try to help you out.

Getting the Source Data. We’ll be using solar resource data from the National Renewable Energy Lab. The data set contains direct normal irradiance (DNI) average values for the contiguous United States and Hawaii. If you want to find out more, details should be contained within the metadata. You can download the entire data set to work through the examples below. Unzip to a directory, navigate to it, and fire up the GeoScript shell. (If you see some diagnostic messages along the way, just ignore them and remember that GeoScript is a work in progress.) Let’s get started!

Loading and Exploring Data. First, let’s load our two shapefiles into GeoScript:

>>> from geoscript.layer import Shapefile
>>> solar_all_poly = Shapefile("solar_dni_polygons.shp")
>>> states = Shapefile("usa_l48.shp")

We’re (still) hiring!

OpenGeo is always looking for talented people to join our team. We offer interesting technical work, competitive salaries, great benefits, and a fantastic working environment. Most importantly, we challenge our employees to build the best open source and interoperable tools for spatial data on the web.

Here are a few of our openings:

Project Manager -  OpenGeo seeks someone with the firmness of an Army General and the tenderness of a Little League coach to help manage our developers on client projects. If “GET IT DONE” is your catchphrase, if you are a multitasking ninja, if you had your own Checklist Manifesto long before Atul Gawande put pen to paper, please apply!

Front End Developer -  We’re looking for someone who is ready to work with peers in design and engineering to create pixel-perfect interfaces across a range of projects and products. You’ll own the code-base, work on the hard problems, build your ideas into reality, and help determine best practices throughout our organization.

Senior Inside Sales Manager – The biggest barrier to our sales growth is sales capacity. While that’s a nice problem to have, it’s still a problem! This is a great career opportunity for a seasoned inside salesperson.

Sales Account Manager – Our current (and future) clients are looking to open source to solve their spatial IT needs. Our account managers help commercial enterprises and federal clients use our innovative, open source geospatial software as efficiently and effectively as possible, allowing them to get more than ever out of their geospatial instances.

Here’s the full list; please apply and/or spread the word!