How to allocate a large contiguous memory region in Linux

Date: 2019-06-08 16:21:00

Tags: linux-kernel allocation dma contiguous

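Yes, I will ultimately use this for DMA, but let's leave coherency aside for the moment. I have 64-bit BAR registers, so AFAIK all of RAM (e.g. above 4G) is available for DMA.

I am looking for about 64MB of contiguous RAM. Yes, that's a lot.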

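Ubuntu 16 and 18 have CONFIG_CMA=y, but CONFIG_DMA_CMA is not set at kernel compile time.

I note that if both were set (at kernel build time) I could simply call dma_alloc_coherent. For logistical reasons, however, recompiling the kernel is undesirable.

The machines will always have at least 32GB of RAM and do not run anything RAM-intensive. The kernel module will be loaded shortly after boot, before RAM becomes significantly fragmented, and AFAIK nothing else is using the CMA.

I have set the kernel parameter cma=1G (and also tried 256M and 512M).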

# dmesg | grep cma
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.170 root=UUID=2b25933c-e10c-4833-b5b2-92e9d3a33fec ro cma=1G
[    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.4.170 root=UUID=2b25933c-e10c-4833-b5b2-92e9d3a33fec ro cma=1G
[    0.000000] Memory: 65612056K/67073924K available (8604K kernel code, 1332K rwdata, 3972K rodata, 1484K init, 1316K bss, 1461868K reserved, 0K cma-reserved)

I tried alloc_pages(GFP_KERNEL | __GFP_HIGHMEM, order), no joy.
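
For context, the order argument is the base-2 logarithm of the number of pages requested, and the buddy allocator only serves orders below MAX_ORDER. The userspace sketch below works through that arithmetic for a 64MB request. The constants are assumptions (4 KiB pages and MAX_ORDER = 11, the usual x86-64 defaults for a 4.4 kernel), and the function names are made up for illustration; inside the kernel, get_order() performs the size-to-order conversion.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for x86-64 on a 4.4 kernel; check your own configuration. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_MAX_ORDER 11

/* Smallest order whose block of (PAGE_SIZE << order) bytes covers `bytes`. */
static unsigned int order_for_size(size_t bytes)
{
    unsigned int order = 0;

    assert(bytes > 0);
    while ((SKETCH_PAGE_SIZE << order) < bytes)
        order++;
    return order;
}

/* Nonzero if the buddy allocator could, in principle, serve this order. */
static int order_is_allocatable(unsigned int order)
{
    return order < SKETCH_MAX_ORDER;
}
```

Under these assumptions, order_for_size(64UL << 20) is 14, while the largest allocatable order is 10 (4 MiB), so a request of order 14 or 15 is refused regardless of how much RAM is free.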

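Finally, the actual question: how does one get a large contiguous block from the CMA? Everything I have found online suggests using dma_alloc_coherent, but I understand that this only works with CONFIG_CMA=y and CONFIG_DMA_CMA=y.

Module source, tim.c: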

#include <linux/module.h>       /* Needed by all modules */
#include <linux/kernel.h>       /* Needed for KERN_INFO */
#include <linux/init.h>
#include <linux/mm.h>
#include <linux/gfp.h>
unsigned long big;
const int order = 15;
static int __init tim_init(void)
{
        printk(KERN_INFO "Hello Tim!\n");
        big = __get_free_pages(GFP_KERNEL | __GFP_HIGHMEM, order);
        printk(KERN_NOTICE "big = %lx\n", big);
        if (!big)
                return -EIO; // AT&T

        return 0; // success
}

static void __exit tim_exit(void)
{
        free_pages(big, order);
        printk(KERN_INFO "Tim says, Goodbye world\n");
}

module_init(tim_init);
module_exit(tim_exit);
MODULE_LICENSE("GPL");

Inserting the module yields...

# insmod tim.ko
insmod: ERROR: could not insert module tim.ko: Input/output error
# dmesg | tail -n 33

[  176.137053] Hello Tim!
[  176.137056] ------------[ cut here ]------------
[  176.137062] WARNING: CPU: 4 PID: 2829 at mm/page_alloc.c:3198 __alloc_pages_nodemask+0xd14/0xe00()
[  176.137063] Modules linked in: tim(OE+) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables configfs vxlan ip6_udp_tunnel udp_tunnel uio pf_ring(OE) x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm mei_me mei irqbypass sb_edac ioatdma edac_core shpchp serio_raw input_leds lpc_ich dca acpi_pad 8250_fintek mac_hid ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear
[  176.137094]  hid_generic usbhid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e aesni_intel raid1 aes_x86_64 isci lrw libsas ahci gf128mul ptp glue_helper ablk_helper cryptd psmouse hid libahci scsi_transport_sas pps_core wmi fjes
[  176.137105] CPU: 4 PID: 2829 Comm: insmod Tainted: G           OE   4.4.170 #1
[  176.137106] Hardware name: Supermicro X9SRL-F/X9SRL-F, BIOS 3.3 11/13/2018
[  176.137108]  0000000000000286 8ba89d23429d5749 ffff88100f5cba90 ffffffff8140a061
[  176.137110]  0000000000000000 ffffffff81cd89dd ffff88100f5cbac8 ffffffff810852d2
[  176.137112]  ffffffff821da620 0000000000000000 000000000000000f 000000000000000f
[  176.137113] Call Trace:
[  176.137118]  [<ffffffff8140a061>] dump_stack+0x63/0x82
[  176.137121]  [<ffffffff810852d2>] warn_slowpath_common+0x82/0xc0
[  176.137123]  [<ffffffff8108541a>] warn_slowpath_null+0x1a/0x20
[  176.137125]  [<ffffffff811a2504>] __alloc_pages_nodemask+0xd14/0xe00
[  176.137128]  [<ffffffff810ddaef>] ? msg_print_text+0xdf/0x1a0
[  176.137132]  [<ffffffff8117bc3e>] ? irq_work_queue+0x8e/0xa0
[  176.137133]  [<ffffffff810de04f>] ? console_unlock+0x20f/0x550
[  176.137137]  [<ffffffff811edbdc>] alloc_pages_current+0x8c/0x110
[  176.137139]  [<ffffffffc0024000>] ? 0xffffffffc0024000
[  176.137141]  [<ffffffff8119ca2e>] __get_free_pages+0xe/0x40
[  176.137143]  [<ffffffffc0024020>] tim_init+0x20/0x1000 [tim]
[  176.137146]  [<ffffffff81002125>] do_one_initcall+0xb5/0x200
[  176.137149]  [<ffffffff811f90c5>] ? kmem_cache_alloc_trace+0x185/0x1f0
[  176.137151]  [<ffffffff81196eb5>] do_init_module+0x5f/0x1cf
[  176.137154]  [<ffffffff81111b05>] load_module+0x22e5/0x2960
[  176.137156]  [<ffffffff8110e080>] ? __symbol_put+0x60/0x60
[  176.137159]  [<ffffffff81221710>] ? kernel_read+0x50/0x80
[  176.137161]  [<ffffffff811123c4>] SYSC_finit_module+0xb4/0xe0
[  176.137163]  [<ffffffff8111240e>] SyS_finit_module+0xe/0x10
[  176.137167]  [<ffffffff8186179b>] entry_SYSCALL_64_fastpath+0x22/0xcb
[  176.137169] ---[ end trace 6aa0b905b8418c7b ]---
[  176.137170] big = 0

Curiously, trying again yields...

# insmod tim.ko
insmod: ERROR: could not insert module tim.ko: Input/output error
...and dmesg just shows:

[  302.068396] Hello Tim!
[  302.068398] big = 0

Why is there no stack dump on the second (and subsequent) attempts?
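
One likely explanation, offered as an assumption rather than something confirmed in this thread: in 4.4-era kernels the allocator slow path rejects any order at or above MAX_ORDER and reports it through WARN_ON_ONCE, which emits its stack dump only the first time that call site is hit after boot. The sketch below is a userspace imitation of that once-only behaviour, not the kernel's macro; every name in it is made up for illustration.

```c
#include <stdio.h>

static int warnings_printed; /* counts how often the "dump" was emitted */

/* Userspace imitation of a once-only warning: each use site gets its own
 * static flag, so the message is printed the first time only. */
#define SKETCH_WARN_ON_ONCE(cond)                        \
    do {                                                 \
        static int warned;                               \
        if ((cond) && !warned) {                         \
            warned = 1;                                  \
            warnings_printed++;                          \
            fprintf(stderr, "WARNING: %s\n", #cond);     \
        }                                                \
    } while (0)

#define SKETCH_MAX_ORDER 11 /* assumed x86-64 default */

/* Stand-in for the allocator's sanity check; returns 0 when it refuses. */
static int sketch_alloc(unsigned int order)
{
    if (order >= SKETCH_MAX_ORDER) {
        SKETCH_WARN_ON_ONCE(order >= SKETCH_MAX_ORDER);
        return 0;
    }
    return 1;
}
```

Calling sketch_alloc(15) twice fails both times but prints the warning only once, which matches the two dmesg excerpts above.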

1 Answer:

Answer 0: (score: 3)

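The short version is that __GFP_DIRECT_RECLAIM (which __GFP_RECLAIM also provides) is necessary, because dma_alloc_contiguous is ultimately called and it checks that blocking is OK by calling gfpflags_allow_blocking. I used the usual GFP_KERNEL, which provides __GFP_RECLAIM | __GFP_IO | __GFP_FS. Before that, however, one must call dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)), that is, with DMA_BIT_MASK(64) rather than DMA_BIT_MASK(32).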

    err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
    if (err) {
        printk(KERN_INFO "[%s:probe] dma_set_mask returned: %d\n", DRIVER_NAME, err);
        return -EIO;
    }
    vaddr = dma_alloc_coherent(&pdev->dev, dbsize, paddr, GFP_KERNEL);
    if (!vaddr) {
        printk(KERN_ALERT "[%s:probe] failed to allocate coherent buffer\n", DRIVER_NAME);
        return -EIO;
    }

    iowrite32(paddr, ctx->bar0_base_addr + 0x140); // tell card where to DMA from
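
One point to watch in the snippet above: dma_alloc_coherent hands back the bus address through a dma_addr_t * argument, and with a 64-bit mask that address may lie above 4G, so a single 32-bit register write can only carry its lower half. The userspace sketch below shows the split. The high-half register offset is hypothetical (only 0x140 appears in the snippet), and whether the card has such a register at all is an assumption; in kernel code the lower_32_bits() and upper_32_bits() helpers do the same job.

```c
#include <stdint.h>

/* Hypothetical register offsets; only 0x140 appears in the original snippet. */
#define SKETCH_REG_DMA_ADDR_LO 0x140
#define SKETCH_REG_DMA_ADDR_HI 0x144

/* Lower 32 bits of a 64-bit bus address. */
static uint32_t addr_lo32(uint64_t dma_addr)
{
    return (uint32_t)(dma_addr & 0xffffffffu);
}

/* Upper 32 bits of a 64-bit bus address. */
static uint32_t addr_hi32(uint64_t dma_addr)
{
    return (uint32_t)(dma_addr >> 32);
}

/* Nonzero if a single 32-bit register write would lose address bits. */
static int needs_high_half(uint64_t dma_addr)
{
    return addr_hi32(dma_addr) != 0;
}
```

A driver would then write addr_lo32() and addr_hi32() to the two registers separately, in whatever order the card's documentation requires.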

Using CMA to allocate unreasonably large DMA regions on Ubuntu 16.04 and 18.04

1. Rebuild Kernel
    1. Use uname -a to ascertain your current kernel version
    2. Issue apt install linux-source-[version] to fetch the kernel source
    3. Copy /boot/config-[version] to /usr/src/linux-source-[version]/.config
    4. Edit .config
        1. Locate the line "# CONFIG_DMA_CMA is not set"
        2. Change it to CONFIG_DMA_CMA=y
    5. Build the kernel
        1. make -j[2 × # of cores]
        2. make -j[2 × # of cores] modules
        3. make modules_install
        4. make install
    6. You have rebuilt the kernel

2. Configure CMA to reserve RAM
    1. Edit /etc/default/grub
        1. Locate GRUB_CMDLINE_LINUX=""
        2. Change to GRUB_CMDLINE_LINUX="cma=33G"
        3. Use your desired CMA reserved RAM in place of 33G
    2. Issue update-grub
    3. Reboot
    4. Issue dmesg | grep cma
        1. Look for Memory: 30788784K/67073924K available (14339K kernel code, 2370K rwdata, 4592K rodata, 2696K init, 5044K bss, 1682132K reserved, 34603008K cma-reserved)
        2. Note: this example reserves 33G (34603008K)
    5. You have configured CMA to hold back RAM from the normal allocation subsystems
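
To make the dmesg check in step 4 concrete, here is a small userspace sketch that pulls the cma-reserved figure out of the "Memory:" line and converts it to GiB. It assumes the line format shown in this thread; the function names are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the cma-reserved amount in KiB from a dmesg "Memory:" line,
 * or -1 if the line carries no such field. */
static long long cma_reserved_kib(const char *line)
{
    const char *tag = strstr(line, "K cma-reserved");
    const char *p = tag;

    if (!tag)
        return -1;
    /* Walk back over the digits that precede the 'K'. */
    while (p > line && p[-1] >= '0' && p[-1] <= '9')
        p--;
    if (p == tag)
        return -1; /* no digits in front of the tag */
    return strtoll(p, NULL, 10);
}

/* Converts KiB to GiB. */
static double kib_to_gib(long long kib)
{
    return (double)kib / (1024.0 * 1024.0);
}
```

Fed the line from step 4, this yields 34603008 KiB, i.e. 33 GiB; fed the line from the question, it yields 0, which shows that the reservation never took effect there.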

3. Alter your kernel module (driver) source
    1. Inform the kernel that the card can address 64 bits
    2. In your probe function, locate a line like dma_alloc_coherent(…
    3. A few lines before that you may find dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))
    4. Change this to dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64))
    5. You have informed the kernel that the card in question is not restricted to low memory
    6. Call dma_alloc_coherent(&pdev->dev, dbsize, paddr, GFP_KERNEL)
    7. dbsize may specify up to 32G
    8. Recompile your kernel module (driver) and test
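
A practical note on step 7: a size like 32G does not fit in a 32-bit int, so an expression such as 32 * 1024 * 1024 * 1024 overflows before it ever reaches dma_alloc_coherent, whose size parameter is a size_t. The userspace sketch below computes the size in 64-bit arithmetic and checks it against the cma-reserved figure reported by dmesg. The helper names are hypothetical.

```c
#include <stdint.h>

/* Converts GiB to bytes using 64-bit arithmetic throughout. */
static uint64_t gib_to_bytes(uint64_t gib)
{
    return gib << 30;
}

/* Nonzero if a request of `request_bytes` fits inside a CMA area of
 * `cma_reserved_kib` KiB (the number dmesg reports as "cma-reserved"). */
static int fits_in_cma(uint64_t request_bytes, uint64_t cma_reserved_kib)
{
    return request_bytes <= cma_reserved_kib * 1024ULL;
}
```

With the 33G reservation from step 2 (34603008K), a 32G request passes this check and a 34G request does not. Passing the check is necessary but not sufficient, since the allocation can still fail if part of the area is in use.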