Hi, I've been working on a PHP script that reads all battles from the Card Hunter API and sends them to handler functions. The idea is to run this regularly via a cron job, which would ensure that each battle is sent over to the handlers once and only once. The handler functions can then do things like count wins, respond to matches between rivals, notify you when particular players are online, or even record wins for each side on particular maps to see whether they're balanced. There's huge scope for what this could do, as evidenced by me making half of those things up as I typed them. I'm pretty excited about it.

So, the point of this thread is to share what I've built so far. I have very little PHP experience, and my MySQL-fu is weak, so by all means please suggest changes and improvements. Nonetheless, this does seem to work, and I figured it might be a good starting point for anyone wanting to run something like it. Without further ado, here are the files:

PHP:
<?php

include 'config.php';

// set up battle handlers
$battleHandlers = array();
include 'echobattlehandler.php';

// hook up to the db
$dbConn = mysqli_connect( DB_IP, DB_USER, DB_PASSWORD, DB_DEFAULT_DATABASE, DB_PORT );
if( !$dbConn || mysqli_connect_errno() )
{
    die( "Database connection error: " . mysqli_connect_error() );
}

// grab the last battle id
// or -1 if we haven't run before
$result = $dbConn->query( "SELECT * FROM `last_battle_id`;" );
if( $result === false )
{
    die( "Error in battle id query: " . $dbConn->error );
}
$lastID = -1;
if( $result->num_rows == 1 )
{
    $row = $result->fetch_assoc();
    $lastID = $row["battle_id"];
}
else if( $result->num_rows > 1 )
{
    die( "Too many rows returned by last battle id query" );
}
$result->close();

// set up battle id in db if not there already
if( $lastID == -1 )
{
    $dbConn->query( "INSERT INTO `last_battle_id` VALUES ( -1 );" );
}

$battlesRemain = true;
$battlesRequestCount = 0;
$curl = curl_init();
curl_setopt( $curl, CURLOPT_RETURNTRANSFER, 1 );
while( $battlesRemain && $battlesRequestCount < MAX_BATTLES_REQUESTS )
{
    // assemble url
    $url = "http://api.cardhunter.com/battles?count=" . REQUEST_DATA_COUNT;
    if( $lastID >= 0 )
    {
        // if we know where to start from, start from there. Otherwise defaults to the end.
        // use the demarc page system to ensure we start from the next battle after lastID.
        $url .= "&page=prev&demarc=" . $lastID;
    }

    // grab data from cardhunter api service
    curl_setopt( $curl, CURLOPT_URL, $url );
    $battles = json_decode( curl_exec( $curl ) );

    // did we get any battles? (json_decode returns null on a failed request)
    if( $battles !== null && property_exists( $battles, "battles" ) )
    {
        // flip the battles so they're ordered oldest-to-newest
        $battles->battles = array_reverse( $battles->battles );

        // handle each battle
        $count = 0;
        foreach( $battles->battles as $battle )
        {
            // reformat time string so it can be used with MySQL DATETIME
            $battle->start = substr( $battle->start, 0, 10 ) . " " . substr( $battle->start, 11, 8 );

            // run each handler for it
            foreach( $battleHandlers as $battleHandler )
            {
                call_user_func( $battleHandler, $battle );
            }
            $lastID = max( $lastID, $battle->id );
            ++$count;
        }

        if( $count < REQUEST_DATA_COUNT )
        {
            // well, that's all of 'em
            $battlesRemain = false;
        }
    }
    else
    {
        // no battles array. What happened?
        echo "Error: No battles object in data";
        var_dump( $battles );
        $battlesRemain = false;
    }
    ++$battlesRequestCount;
}

// store last ID
$statement = $dbConn->prepare( "UPDATE `last_battle_id` SET `battle_id`=?;" );
$statement->bind_param( "i", $lastID );
$statement->execute();
$statement->close();

// tidy up
curl_close( $curl );
$dbConn->close();

This is the main pump.
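For what it's worth, if you go the cron route, an entry along these lines should do the 15-minute scheduling (the script path and php binary location here are just examples; substitute wherever you've actually put the pump):

Code:
# run the battle pump every 15 minutes, discarding its output
*/15 * * * * /usr/bin/php /path/to/battlepump.php >/dev/null 2>&1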
I ended up using the undocumented "demarc" API parameter, which is used to paginate the data for large requests. This allowed me to get the next batch of battles from a particular point onward, which unfortunately didn't seem possible using the "before" and "after" parameters, since those always start at the most recent end of the list.

PHP:
<?php

// general
define( "REQUEST_DATA_COUNT", 25 );   // maximum number of battles in each request
define( "MAX_BATTLES_REQUESTS", 10 ); // maximum number of times to request battles

// database config
define( "DB_IP", '127.0.0.1' );
define( "DB_PORT", '3306' );
define( "DB_USER", 'your db user name here' );
define( "DB_PASSWORD", "your db user's password here" );
define( "DB_DEFAULT_DATABASE", "cardhunter_api_playground" );

?>

This is an example of config.php. It contains options for the battle pump. Splitting this out into its own file should make it easier to maintain live and development versions of the code, I think. We'll see. Obviously you'll put your own details in the db fields.

PHP:
<?php

// define and add main handler
function handleBattle( $battle )
{
    var_dump( $battle );
}
$battleHandlers[] = "handleBattle";

?>

This is echobattlehandler.php. It's just a sample handler that dumps the battles out on screen. You'll want to write your own handlers to do clever things with the battle data (there's a sketch of a slightly meatier handler at the end of this post).

Code:
CREATE SCHEMA IF NOT EXISTS `cardhunter_api_playground` DEFAULT CHARACTER SET latin1 ;

USE `cardhunter_api_playground` ;

-- -----------------------------------------------------
-- Table `cardhunter_api_playground`.`last_battle_id`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `cardhunter_api_playground`.`last_battle_id` (
  `battle_id` BIGINT(20) NOT NULL ,
  PRIMARY KEY (`battle_id`) )
ENGINE = InnoDB
DEFAULT CHARACTER SET = latin1;

This chunk of SQL sets up your db to store the most recently accessed battle id. I recommend emptying this table if you haven't run the pump for a while, as that'll instruct the scraper to just start from the most recent set of battles. Obviously if you want to change the db name you'll need to do so both here and in config.php.

So, uh, yeah. That's it so far. I have a handler which collects more interesting data, but I still need to write some new PHP pages to browse that data for it to be at all useful. I'm also considering attempting to run the scraper every time the site I'm building is accessed, but then throttling the scraper so it runs no more often than once a minute (there's a rough sketch of such a throttle at the bottom of this post). Between that and 15-minute cron scheduling I think I'll get a good balance of load and timeliness. If I could just schedule it to run every minute I'd do that instead, but my host doesn't support it.

Anyhow, since this is currently reasonably simple and self-contained, I figured I'd share it, since from here on out my dev copy is going to become less and less generically useful. Enjoy!
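As promised above, here's a rough sketch of what a more useful handler might look like: one that tallies wins per player. Big caveat: I'm guessing at the battle object's property names ($battle->winner here is made up), so check the real field names against a var_dump from echobattlehandler.php before using anything like this, and you'd need a matching `wins` table in the db:

PHP:
<?php

// wincounthandler.php -- a hypothetical handler that counts wins per player.
// WARNING: $battle->winner is a guessed property name; verify the real fields
// by inspecting the output of echobattlehandler.php first.
// Assumes a table like:
//   CREATE TABLE `wins` ( `player` VARCHAR(64) NOT NULL, `wins` INT NOT NULL,
//                         PRIMARY KEY (`player`) );
function handleWinCount( $battle )
{
    global $dbConn; // the pump's db connection, set up before handlers are called

    // insert a first win, or bump the count if the player already has a row
    $statement = $dbConn->prepare(
        "INSERT INTO `wins` VALUES ( ?, 1 ) ON DUPLICATE KEY UPDATE `wins` = `wins` + 1;" );
    $statement->bind_param( "s", $battle->winner );
    $statement->execute();
    $statement->close();
}
$battleHandlers[] = "handleWinCount";

?>

The ON DUPLICATE KEY UPDATE trick saves doing a separate SELECT-then-UPDATE round trip per battle.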
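And here's the kind of throttle I had in mind for the run-on-page-load idea: a minimal sketch using a timestamp file (the /tmp path is just an example; any writable location will do). It isn't bulletproof against two page hits landing at exactly the same instant, but for keeping the load down it should be plenty:

PHP:
<?php

// throttle sketch: drop this at the top of the pump script
$throttleFile = "/tmp/battlepump.lastrun";

// read the time of the last run, or 0 if we've never run
$lastRun = file_exists( $throttleFile ) ? (int)file_get_contents( $throttleFile ) : 0;
if( time() - $lastRun < 60 )
{
    // the pump ran less than a minute ago; bail out quietly
    exit;
}

// record this run before doing any real work
file_put_contents( $throttleFile, time() );

?>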