Bulk insert into SQLite DB when there are ManyToMany and ForeignKey relationships

Asked: 2017-07-16 09:14:51

Tags: python sql django

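I have a database table structure that contains ManyToMany relationships and ForeignKeys. I am scraping the IEEE Xplore website to build a database of publications for researchers. A single scrape can return bulk data (up to 1000 publications). The program is up and running: I successfully extract the data from the website and write it to the database. Because of the relationships, however, I have to save each row first and only then add its many-to-many elements. This makes the write very slow (several minutes).

I have read about bulk_create and the atomic decorator, but the documentation says they do not work with many-to-many relationships. That makes sense, because each record has to be saved before its many-to-many relationships can be added. The only way out seems to be raw SQL inserts: load everything into a temporary table and then merge it into the main tables. The data structure is shown below. Before I go down the raw SQL route, I thought I would post here to check whether there is another way, or any documentation or suggestions I have missed.

Thanks in advance.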

from django.db import models
from django import forms
from django.forms import ModelForm, Textarea

# Create your models here.

class Journal(models.Model):
    name = models.CharField(max_length = 100)
    organization = models.CharField(max_length = 100, blank = True)
    issn_number = models.CharField(max_length=50, blank=True)
    pub_type = models.CharField(max_length=100, blank=True)

    def __unicode__(self):
        return self.name


class Author(models.Model):
    first_name = models.CharField(max_length = 20, blank = True)
    last_name = models.CharField(max_length = 20, blank = True)
    middle_name = models.CharField(max_length = 20, blank = True)
    full_name = models.CharField(max_length = 50)
    email = models.EmailField(blank = True)

    def __unicode__(self):
        return self.full_name


class Paper(models.Model):
    paper_title = models.CharField(max_length=200)
    paper_year = models.IntegerField(blank = True, null = True)
    paper_volume = models.IntegerField(blank = True, null = True)
    paper_issue = models.IntegerField(blank = True, null = True)
    paper_number = models.CharField(max_length = 100, blank = True, null = True)
    paper_pages = models.CharField(max_length = 100, blank = True, null = True)
    paper_month = models.CharField(max_length = 15, blank = True, null = True)
    paper_doi = models.CharField(max_length = 50, blank = True, null = True)
    paper_abstract = models.TextField(blank = True, null = True)
    paper_keywords = models.TextField(blank = True, null = True)
    paper_journal = models.ForeignKey(Journal)
    paper_authors = models.ManyToManyField(Author, through = 'Contributor')
    paper_arnumber = models.CharField(max_length = 20, blank=True, null=True,
                                    verbose_name="Article number")
    paper_url = models.URLField(blank=True, null=True, verbose_name="Paper URL")
    paper_pdflink = models.URLField(blank=True, null=True, verbose_name="PDF download link")

    def __unicode__(self):
        return self.paper_title


class Contributor(models.Model):
    author = models.ForeignKey(Author)
    paper = models.ForeignKey(Paper)
    position = models.IntegerField(default = 0)

    def __unicode__(self):
        return self.author.full_name + " wrote " + \
                self.paper.paper_title + " as " + \
                str(self.position) + " author"


class Institution(models.Model):
    name = models.CharField(max_length=200)

    def __unicode__(self):
        return self.name


class Affiliation(models.Model):
    institution = models.ForeignKey(Institution)
    author = models.ForeignKey(Author)
    year = models.IntegerField(blank=True, null=True)

    def __unicode__(self):
        return self.author.full_name + " was associated with " + \
                self.institution.name + " in the year " + \
                str(self.year)
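
For reference, the temporary-table idea described above could look roughly like the following. This is only a minimal sketch using the standard-library sqlite3 module, not a tested solution for the real project. The table and column names are simplified stand-ins for the tables Django would generate from the models above, the `scraped` records are made up, and the sketch assumes that journal names, author full names and paper titles are unique, which the models above do not enforce (a DOI or article number may be a better key in practice).

```python
import sqlite3

# Hypothetical scraped records: (title, year, journal name, ordered author names).
scraped = [
    ("Paper A", 2015, "Journal X", ["Ann Lee", "Bob Roy"]),
    ("Paper B", 2016, "Journal X", ["Bob Roy"]),
    ("Paper C", 2016, "Journal Y", ["Cy Poe", "Ann Lee"]),
]


def make_schema(conn):
    """Create simplified stand-ins for the Journal/Author/Paper/Contributor tables."""
    conn.executescript("""
        CREATE TABLE journal (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
        CREATE TABLE author (id INTEGER PRIMARY KEY, full_name TEXT UNIQUE NOT NULL);
        CREATE TABLE paper (
            id INTEGER PRIMARY KEY,
            paper_title TEXT UNIQUE NOT NULL,
            paper_year INTEGER,
            paper_journal_id INTEGER NOT NULL REFERENCES journal(id));
        CREATE TABLE contributor (
            id INTEGER PRIMARY KEY,
            author_id INTEGER NOT NULL REFERENCES author(id),
            paper_id INTEGER NOT NULL REFERENCES paper(id),
            position INTEGER NOT NULL DEFAULT 0,
            UNIQUE (author_id, paper_id));
        CREATE TEMP TABLE staging (
            title TEXT, year INTEGER, journal TEXT, author TEXT, position INTEGER);
    """)


def bulk_load(conn, records):
    """Stage one row per (paper, author) pair, then merge into the main tables."""
    rows = [
        (title, year, journal, name, pos)
        for title, year, journal, authors in records
        for pos, name in enumerate(authors, start=1)
    ]
    with conn:  # a single transaction: commit on success, roll back on error
        conn.executemany("INSERT INTO staging VALUES (?, ?, ?, ?, ?)", rows)
        # Parent tables first, so their ids exist when the children are merged.
        conn.execute(
            "INSERT OR IGNORE INTO journal (name) SELECT DISTINCT journal FROM staging")
        conn.execute(
            "INSERT OR IGNORE INTO author (full_name) SELECT DISTINCT author FROM staging")
        conn.execute("""
            INSERT OR IGNORE INTO paper (paper_title, paper_year, paper_journal_id)
            SELECT DISTINCT s.title, s.year, j.id
            FROM staging AS s JOIN journal AS j ON j.name = s.journal
        """)
        # The many-to-many rows are resolved by joining back on the natural keys.
        conn.execute("""
            INSERT OR IGNORE INTO contributor (author_id, paper_id, position)
            SELECT a.id, p.id, s.position
            FROM staging AS s
            JOIN author AS a ON a.full_name = s.author
            JOIN paper AS p ON p.paper_title = s.title
        """)
        conn.execute("DELETE FROM staging")


conn = sqlite3.connect(":memory:")
make_schema(conn)
bulk_load(conn, scraped)
```

The point of the sketch is that every statement runs inside one transaction and the link rows are produced by a single INSERT ... SELECT, so no per-row save is needed. Within the Django ORM, a comparable approach might be to call bulk_create on the through model (Contributor) inside transaction.atomic(); as far as I know, bulk_create on SQLite did not populate primary keys in Django versions of that time, so the saved papers and authors would have to be re-queried before building the Contributor rows.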

0 answers:

No answers yet