Pluto - Technical details
The diagram on the right shows a very high-level view of Pluto's internal structure. It is oversimplified, but it makes the point that Pluto is built on a well-proven architecture of layers and tiers, with an unwavering focus on reuse.

The design has also evolved with the realization that, within a subnet, Pluto is most economically operated in a client-server architecture, where the server comprises the persistence layer and the data-fetching logic. Each deployment of Pluto's server requires gigabytes of disk space to store end-of-day stock values for the last 20 years (or more), not to mention the network bandwidth. It is estimated that, on average, Pluto downloads around 70 MB of data. With this in mind, the communication between the client and the server has been kept extremely loosely coupled, which makes it possible to deploy the client alone as a separate application, an applet or, for that matter, a WebStart application. Spring is used heavily to achieve this decoupling through interface-based dependency injection.

Pluto's data capture algorithms rely heavily on screen-scraping technologies. At present the scraping is built in, using custom string manipulation code that is not very resilient to HTML layout changes in the page. However, the logic that extracts data from the scraped content is well isolated. The intention is to replace the custom logic with WebHarvest and Solvent at a later date. These tools query the HTML DOM rather than raw strings (WebHarvest, for example, through XQuery and XPath expressions), which makes data extraction easier and more resilient to textual changes in the HTML.

An integral part of Pluto's internals is a job subsystem. As mentioned earlier, Pluto is designed to be autonomous and to function with minimal user intervention. For this reason Pluto runs as a background daemon (on Windows, as an operating system service).
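To make the interface-based decoupling of the extraction logic a little more concrete, here is a minimal sketch in Java. It is not Pluto's actual code: the names `QuoteExtractor`, `MarkerQuoteExtractor` and `EndOfDayReader` are hypothetical, and plain constructor injection stands in for the wiring that a container such as Spring would normally do. The point is only that the caller depends on an interface, so a string-based extractor could later be swapped for a DOM-based one without touching the caller.

```java
import java.util.Optional;

/** Hypothetical contract: turns scraped page text into a closing price. */
interface QuoteExtractor {
    Optional<Double> extractClose(String scrapedHtml);
}

/** String-manipulation extractor: reads the number between two literal markers. */
final class MarkerQuoteExtractor implements QuoteExtractor {
    private final String startMarker;
    private final String endMarker;

    MarkerQuoteExtractor(String startMarker, String endMarker) {
        this.startMarker = startMarker;
        this.endMarker = endMarker;
    }

    @Override
    public Optional<Double> extractClose(String scrapedHtml) {
        int start = scrapedHtml.indexOf(startMarker);
        if (start < 0) {
            return Optional.empty(); // layout changed or marker missing
        }
        int valueStart = start + startMarker.length();
        int end = scrapedHtml.indexOf(endMarker, valueStart);
        if (end < 0) {
            return Optional.empty();
        }
        try {
            String raw = scrapedHtml.substring(valueStart, end).trim();
            return Optional.of(Double.parseDouble(raw));
        } catch (NumberFormatException e) {
            return Optional.empty(); // text between the markers was not a number
        }
    }
}

/** Depends only on the interface; the concrete extractor is injected from outside. */
final class EndOfDayReader {
    private final QuoteExtractor extractor;

    EndOfDayReader(QuoteExtractor extractor) {
        this.extractor = extractor;
    }

    String describe(String symbol, String scrapedHtml) {
        return extractor.extractClose(scrapedHtml)
                .map(close -> symbol + " closed at " + close)
                .orElse(symbol + ": no quote found");
    }
}

final class ExtractorDemo {
    public static void main(String[] args) {
        // Hand-wired here; a DI container would supply the extractor instead.
        QuoteExtractor extractor = new MarkerQuoteExtractor("<td class=\"close\">", "</td>");
        EndOfDayReader reader = new EndOfDayReader(extractor);
        System.out.println(reader.describe("ACME", "<tr><td class=\"close\"> 101.25 </td></tr>"));
    }
}
```

Because `EndOfDayReader` never names a concrete extractor, replacing the marker-based implementation with one backed by a DOM query tool would only mean supplying a different `QuoteExtractor`.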
To function on its own, Pluto relies on a barrage of cron-triggered jobs that fire at predefined intervals. The jobs include fetching end-of-day data, fetching intraday data, archiving old records, checking the network status and so on. The cron triggering is implemented with Quartz.

I will be adding more details to this page iteratively over time. In the interim, if you are genuinely interested in learning more about Pluto or contributing to it, please get in touch. I would also love to hear any technical views you might have on making Pluto better. You can always reach me through my guest book.
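For readers who want a feel for the job subsystem described above, the following is a rough sketch of interval-triggered background jobs. It deliberately does not use Quartz: the standard `ScheduledExecutorService` is swapped in so the example runs with nothing but the JDK, and it fires at fixed intervals rather than on cron expressions. The class `JobSubsystem`, the job names and the intervals are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical registry of named background jobs, each fired at a fixed interval. */
final class JobSubsystem {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final Map<String, Long> intervalsMillis = new LinkedHashMap<>();

    /** Registers a job and starts firing it every intervalMillis milliseconds. */
    void register(String name, long intervalMillis, Runnable job) {
        intervalsMillis.put(name, intervalMillis);
        executor.scheduleAtFixedRate(() -> {
            try {
                job.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs of this job,
                // so report it and carry on.
                System.err.println(name + " failed: " + e.getMessage());
            }
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Returns a copy of the registered job names and their intervals. */
    Map<String, Long> registeredJobs() {
        return new LinkedHashMap<>(intervalsMillis);
    }

    void shutdown() {
        executor.shutdownNow();
    }
}

final class JobSubsystemDemo {
    public static void main(String[] args) throws InterruptedException {
        JobSubsystem jobs = new JobSubsystem();
        // Intervals are shortened so the demo ends quickly; real jobs would run far less often.
        jobs.register("fetchEndOfDay", 200, () -> System.out.println("fetching end-of-day data"));
        jobs.register("checkNetwork", 100, () -> System.out.println("checking network status"));
        Thread.sleep(500);
        jobs.shutdown();
    }
}
```

A cron-based scheduler such as Quartz adds calendar-style triggers (for example, "every weekday after market close") on top of this basic idea of named jobs that run unattended.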
©2007 Sandeep Deb. All Rights Reserved.