/soc/2012/sanket/www-statscollector: ff54fe736a73: Add function ...

Sanket Agarwal sanket at soc.pidgin.im
Tue Aug 7 15:22:47 EDT 2012


Changeset: ff54fe736a73186b3c465a9c2d8e694f2612df75
Author:	 Sanket Agarwal <sanket at soc.pidgin.im>
Date:	 2012-08-07 11:21 -0400
Branch:	 default
URL: http://hg.pidgin.im/soc/2012/sanket/www-statscollector/rev/ff54fe736a73

Description:

Add function to process files in folder into RawXML

When deploying the database, we may need to process XML
files that we already have in store. I have added a small
function which takes a folder and processes the files in it
into RawXML database objects; these can then be processed
with processAllRawXML.
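
The folder walk described above can be sketched without Django as follows.
This is only an illustration under stated assumptions: the names
random_hex_id and collect_folder_payloads are hypothetical and not part of
this changeset, and the sketch returns plain dicts where the real function
saves RawXML objects.

```python
import glob
import os
import random


def random_hex_id(bits=128):
    """Return a fixed-width lowercase hex string for a random `bits`-bit value."""
    width = (bits + 3) // 4  # number of hex digits needed for `bits` bits
    return '%0*x' % (width, random.getrandbits(bits))


def collect_folder_payloads(folder):
    """Read every regular file in `folder` and pair it with fresh random ids.

    Hypothetical helper: it returns plain dicts instead of saving RawXML rows.
    """
    payloads = []
    for path in sorted(glob.glob(os.path.join(folder, '*'))):
        if not os.path.isfile(path):
            continue  # glob also matches subdirectories; skip them
        with open(path, 'rb') as handle:
            data = handle.read()
        payloads.append({
            'path': path,
            'hash_id': random_hex_id(),
            'file_id': random_hex_id(),
            'data': data,
        })
    return payloads
```

Formatting the ids with a fixed width is a choice made for this sketch: it
does not depend on the trailing "L" that Python 2 appends to hex() of a long,
which the [2:-1] slice in the diff strips off.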

diffstat:

 pidgin_stats_collector/statscollector/process.py |  28 +++++++++++++++++++++++-
 1 files changed, 27 insertions(+), 1 deletions(-)

diffs (45 lines):

diff --git a/pidgin_stats_collector/statscollector/process.py b/pidgin_stats_collector/statscollector/process.py
--- a/pidgin_stats_collector/statscollector/process.py
+++ b/pidgin_stats_collector/statscollector/process.py
@@ -8,6 +8,10 @@ from statscollector.constants import *
 from lxml import etree
 import pdb
 import re
+from django.core.files.temp import NamedTemporaryFile
+from django.core.files import File
+import random
+import glob
 
 class Process:
 
@@ -218,7 +222,29 @@ class Process:
     if not Info.objects.filter(raw_xml = this.raw_xml):
       langs = this.getInfo().save()
 
-def processAllRawXML():
+def processFilesInFolder(folder):
+  """ Processes all the files in a given folder to RawXML objects """
+  files = glob.glob(folder+'/*')
+  for f in files:
+    string = file(f).read()
+
+    # Create a file object from this data
+    temp = NamedTemporaryFile(delete=True)
+    temp.write(string)
+    temp.flush()
+
+    raw_sub = RawXML()
+    raw_sub.hash_id = hex(random.getrandbits(128))[2:-1]
+    raw_sub.file_id = hex(random.getrandbits(128))[2:-1]
+    raw_sub.stats_xml.save(raw_sub.file_id, File(temp))
+    raw_sub.save()
+
+    # Rip the XML apart for various bits of information
+    Process(raw_sub).process()
+
+
+
+def processAllRawXML(load=True):
 
   """ Helper function to process all RAW XML files currently
   in store. It is advisable to flush all tables before


